(57) Abstract

The invention relates to new expression systems and in particular to an expression system in which a gene of interest is expressed
at an optimal level. The invention provides a recombinant expression vector comprising a gene of interest and a selectable marker gene,
wherein the selectable marker gene is arranged downstream of the gene of interest and a stop codon associated with the gene of interest is
spaced from a start codon of said selectable marker gene at a distance which is sufficient to ensure that translation reinitiation is required
before said selectable marker protein is expressed from the corresponding mRNA. Examples of such expression systems are retroviral
packaging cell lines, and a number of preferred cell lines have been identified.

Expression systems

The present invention relates to new expression systems,
and in particular to expression systems in which a gene of
interest is expressed at an optimal level. Particular
examples of such expression systems are retroviral packaging
cell lines and a number of preferred cell lines have been
identified.

The ability of eukaryotic and prokaryotic ribosomes to
reinitiate translation at an internal start codon within an
mRNA sequence has previously been recognised. Studies have
been reported in which the efficiency of the process, which
is generally regarded as being low, has been connected with
the length of the intercistronic sequence (Kozak (1987)
Mol. Cell Biol. 7, 3438-3445). Selection of this sequence
or spacer as 70 bp in length, and containing no other start
codons, has been previously reported as being optimal for
reinitiation in a eukaryotic cell line (Cosset F-L.,
Virology (1991) 185, 862).

The applicants have found a way in which the inefficiency
associated with the translation reinitiation process can be
used to good effect.



According to the present invention there is provided a
recombinant expression vector comprising a gene of interest
and a selectable marker gene, wherein the selectable marker
gene is arranged downstream of the gene of interest and a
stop codon associated with the gene of interest is spaced
from a start codon of said selectable marker gene at a
distance which is sufficient to ensure that translation
re-initiation is required before said selectable marker
protein is expressed from the corresponding mRNA.

The invention further provides a process for producing cell
lines in which a gene of interest is expressed, which
process comprises transforming host cells with an expression
vector comprising said gene of interest and a selectable
marker gene, wherein the selectable marker gene is arranged
downstream of the gene of interest and a stop codon
associated with the gene of interest is spaced from a start
codon of said selectable marker gene at a distance which is
sufficient to ensure that translation re-initiation is
required before said selectable marker protein is expressed
from the corresponding mRNA, and selecting those cells where
expression of the selectable marker gene may be detected.

Since re-initiation of translation is a relatively
inefficient process, the selectable marker protein will be
expressed at lower levels than the product of the gene of
interest. Consequently, when the marker protein is
expressed at detectable levels, the gene of interest will be
expressed at higher levels. This ensures that, during
the subsequent selection procedure, only those cell clones
which express the gene of interest at higher or optimal
levels survive. Low expressing clones are
eliminated by the selection process.
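
The selection argument above can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that marker expression is a fixed fraction of gene-of-interest expression (the hypothetical parameter reinit_efficiency) and that a clone survives selection once its marker expression reaches a hypothetical threshold. Neither value is taken from the present description.

```python
def surviving_clones(goi_levels, marker_threshold, reinit_efficiency):
    """Return the gene-of-interest levels of the clones that survive selection.

    Illustrative assumption: marker expression = goi_level * reinit_efficiency,
    where 0 < reinit_efficiency <= 1 models inefficient re-initiation.
    """
    if not 0 < reinit_efficiency <= 1:
        raise ValueError("reinit_efficiency must be in (0, 1]")
    return [g for g in goi_levels if g * reinit_efficiency >= marker_threshold]


def minimum_goi_level(marker_threshold, reinit_efficiency):
    """Lowest gene-of-interest level compatible with survival (same assumption)."""
    if not 0 < reinit_efficiency <= 1:
        raise ValueError("reinit_efficiency must be in (0, 1]")
    return marker_threshold / reinit_efficiency


# Arbitrary expression units; the numbers are placeholders, not measured data.
clones = [1, 5, 50, 200]
print(surviving_clones(clones, marker_threshold=1.0, reinit_efficiency=0.05))
print(minimum_goi_level(marker_threshold=1.0, reinit_efficiency=0.05))
```

Under these assumed values the two low expressers fail to reach the marker threshold, so only the high expressers remain. The lower the assumed re-initiation efficiency, the higher the gene-of-interest level a surviving clone must have.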

Cells transformed with the above-described expression
vectors form a further aspect of the invention.

The host cells are suitably eukaryotic or prokaryotic host 
cells, preferably eukaryotic host cells. 

The number of nucleotides in the space between the stop 
codon of the gene of interest and the start codon of the 
selectable marker will suitably be in the range of from 20 to
200 nucleotides, preferably from 60 to 80 nucleotides, even
more preferably 70 to 80 nucleotides.
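
As a rough aid to checking a candidate spacer against these ranges, the following sketch reports which of the stated length ranges a sequence falls into. The function name spacer_report is hypothetical, and the additional test for an ATG triplet within the spacer is an assumption drawn from the prior-art observation above that the spacer should contain no other start codons.

```python
def spacer_report(spacer):
    """Classify the sequence lying between the stop codon of the gene of
    interest and the start codon of the selectable marker (both excluded)."""
    seq = spacer.upper().replace("U", "T")  # accept RNA or DNA notation
    n = len(seq)
    return {
        "length": n,
        "suitable": 20 <= n <= 200,       # 20 to 200 nucleotides
        "preferred": 60 <= n <= 80,       # 60 to 80 nucleotides
        "more_preferred": 70 <= n <= 80,  # 70 to 80 nucleotides
        "contains_atg": "ATG" in seq,     # assumed extra check, see lead-in
    }


# Placeholder sequence of 74 nucleotides; not a sequence from the description.
print(spacer_report("C" * 74))
```

The placeholder is only there to exercise the function. A real spacer sequence would be substituted for it.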

The vectors used in the process of the invention may be any
of the known types, for example expression plasmids or viral
vectors.

Selected cells may be cultured and, if required, the protein
product of the gene of interest isolated from the culture
using conventional techniques. Alternatively, expression of
the gene of interest may result in other desired effects,
for example, where the gene of interest is included as part
of a viral packaging construct.

Some experimental and clinical gene transfer protocols
require the design of gene transfer vectors suitable for in
vivo gene delivery (Miller, A.D. 1992. Nature 357:455-460).
Retroviral vectors are attractive candidates for
such applications, because they can provide stable gene
transfer and expression (Samarut J. et al., Meth. Enzymol. in
press) and because packaging cells have been designed which
produce non-replication competent viruses (Miller A.D. (1990)
Hum Gene Ther. 1:5-14). However, currently available
recombinant retroviruses suffer from a number of drawbacks.

Packaging cell lines provide in trans the retroviral
proteins encoded by the gag, pol, and env genes required to
obtain infectious retroviral particles. The gag and pol
products are respectively the structural components of the
virion cores and the replication machinery (enzymes) of the
retroviral particles, whereas the env products are envelope
proteins responsible for the host-range of the virions,
for the initiation of infection and for sensitivity to
humoral factors. An ideal packaging cell line should
produce retroviruses that only contain the retroviral vector
genome, and absolutely no replication-competent genomes or
defective genomes encoding some of the viral structural
genes.

A number of packaging cell lines for human gene
transfer have been designed in the past by introducing
plasmid DNAs which contain "helper genomes" encoding gag,
pol and/or env genes into cells.

Retroviral packaging cell lines are cells that have been
engineered to provide in trans all the functions required to
express infectious retroviral vectors. A helper genome (or
construct or unit) is herein also referred to as a
"retroviral packaging construct (or unit)", a "packaging-
deficient construct (or genome unit)" or "gag-pol/env
expression plasmids".

Much effort has been made to design strategies to optimize
the helper genomes in order (i) to get the highest
production of retroviral packaging functions (which
correlates with infection titers of retroviral particles)
and (ii) to minimise the chance that the helper genome can
be transmitted via the viral particles (which may lead to
emergence of unwanted retroviral forms).

The first of these packaging cell lines used full length
retroviral genomes as helper genomes that had been crippled
for important cis-regulated replicative functions (reviewed
in Miller, Hum. Gene Ther. 1:5-14 1990). In order to reduce
the possibility of occurrence of replication-competent
viruses and of transfer of virus structural genes, a second
generation of safer packaging cell lines has been designed
by using two separate and complementary helper genomes which
express either gag-pol or env and are packaging-deficient
(Miller supra).

The cells into which these helper genomes were introduced
were isolated by cotransfecting them with plasmids encoding
selectable markers. However, as no selection was applied on
the packaging-deficient retroviral genome itself, the helper
functions can be lost during the passages of the cells in
culture and the current packaging systems provide limited
titers of infectious retroviral vectors, usually only of the
order of 10^5-10^6 infectious units (i.u.)/ml. Indeed, the
cotransfection with a plasmid encoding a selectable marker
does not directly select the best gag-pol-env-expressing
cells.






The invention further provides a retroviral packaging cell
line comprising a host cell transformed with (i) a packaging-
deficient construct which expresses a viral gag-pol gene and
a first selectable marker gene, and/or (ii) a packaging-
deficient construct which expresses a viral env gene and a
second selectable marker gene; wherein the start codons of the
first and second selectable markers are spaced from the stop
codons of the viral gag-pol gene and the viral env gene
respectively by a distance which ensures that reinitiation
of mRNA translation is required for expression of the marker
protein product of said first and/or second selectable
marker gene.

The retroviral packaging cell line may be obtained by the
above-described process, which will involve selecting
transfected cells which express said first and/or second
marker genes.



By using helper constructs which are directly selectable and
which provide for high expression of the viral gene, high
titre retroviral vectors may be obtained.



Helper constructs for use in the process form a further
aspect of the invention.

The retroviral vectors prepared from the conventional
packaging cell lines are usually not contaminated by
replication-competent retroviruses (RCRs). However,
recombinant amphotropic murine retroviruses have been shown
to arise spontaneously from certain packaging cell lines.
The generation of such RCRs involves recombination at least
between gag-pol/env packaging sequence and vector sequences
(Cosset et al., Virology, (1993) 193:385-395).



Recombinant RCRs have been associated with the development 
of lymphomas in some severely immunosuppressed monkeys 
(Donahue et al., J. Exp Med (1992) 176: 1125-1135). In 
addition, retroviral vector preparations may also contain, 
at low frequencies, retroviruses coding for functional 
envelope glycoproteins (Kozak and Kabat, 1990, J. Virol. 64: 
3500-3508) or for gag-pol proteins. Although the 
pathogenicity of these gag-pol or env recombinant 
retroviruses is probably low, more evolved recombinant 
retroviruses with higher pathogenic potential may occur when 
injected in vivo, by recombination and/or complementation of
the initial recombinant viruses with some endogenous
retroviruses.

In a preferred embodiment of the retroviral packaging cell
lines of the invention, the overlapping sequences between
the genomes of the retroviral vector and the helper
construct are reduced, for example as compared to constructs
such as CRIPenv and CRIPAMgag (Danos et al., Proc. Natl.
Acad. Sci USA 85: 6460-6464). In particular, the viral
sequences in the helper construct are reduced; for example,
not only the packaging sequence but also the 3' Long
Terminal Repeat (LTR), the 3' non-coding sequence and/or the
5' LTR may be eliminated.

The possibility of generation of such RCRs and recombinant
retroviruses can be reduced by reducing the overlapping
sequences between the genomes of both the retroviral vector
and the helper construct.

Conventional retroviral vectors are strongly inactivated by
human serum, which makes them of limited or no use for in
situ gene transfer in gene therapy applications. It has
previously been shown that inactivation by complement in
human serum is controlled by the cell line used to produce
the virions and by viral envelope determinants (Takeuchi et
al., J. Virol (1994) 68:8001-8007). In particular,
inactivation is caused by some properties of the cell lines
that have been used to construct the packaging cells (NIH-
3T3) and also by viral determinants located in the
retroviral envelope (Takeuchi et al., J. Virol
(1994) 68:8001-8007). In vivo gene delivery is an important
goal for a number of human gene therapy strategies.

The applicants have found that certain cell lines form
preferred packaging cell lines.

Particularly preferred packaging cell lines are the HT1080
line, the TE671 line, the 3T3 line, the 293 line and the
Mv-1-Lu line. One example of retroviral packaging cells that
will produce complement-resistant virus comprises human
HT1080 cells expressing the RD114 envelope. Such cells form a
preferred aspect of the invention.

Packaging cell lines according to the invention provide
50-100 fold increased titers of retroviral vectors as compared
to conventional packaging cell lines. Retroviral vectors
provided by these new cells are safe, in terms of generation
of RCRs, and considerably more resistant to inactivation by
human complement.






Packaging cell lines according to the invention may be able
to transduce helper-free, human complement-resistant
retroviral vectors at titers consistently higher than 10'
i.u./ml.



Suitable semi-packaging cell lines in accordance with the
invention are those which express only the gag-pol genes.
Such cell lines may suitably be derived from TE671, Mink
Mv-1-Lu, HT1080, 293 or NIH-3T3 cells by introduction of
plasmid CeB (the MoMLV gag-pol expression unit).

Particularly preferred expression vectors in accordance with
the invention for use in retroviral packaging cell lines are
those which include MLV gag and pol genes, such as CeB.
Other plasmids may include gag and pol genes from other
retroviruses or chimeric or mutated gag and pol genes.

Various viral and retroviral envelope genes may be included
in the plasmids, such as MLV-A envelope, GALV envelope, VSV-G
protein, BaEV envelope, RD114 envelope and chimeric or
mutated envelopes. Plasmids which include the RD114 env
gene, such as FBdelPRDSAF as illustrated hereinafter, provide
one example of suitable constructs.

The novel retroviral packaging cells described hereinafter
have been designated FLY cells, and may be designed for in
vivo gene delivery.

Considerable variations were found between the various cell 
lines screened for their ability to release type C mammalian 
retroviruses. In addition, few cell lines were able to 
produce retroviruses completely resistant to human 
complement. Based on these two criteria, human fibrosarcoma 
HT1080 and rhabdomyosarcoma TE671 cells were selected for 
optimum construction of packaging cells.

Other studies have shown the importance of endogenous
retrovirus expression in the generation of recombinant
retroviruses from retroviral packaging lines (Ronfort et
al., Virology, (1995), 207, 271-275; Vanin, E.F. et al., J
Virol (1994) 68:4241-4250). The co-packaging of an
endogenous genome and a vector can lead to emergence of
recombinant retroviruses (Vanin et al., supra).
Recombination involves template switching during reverse
transcription of such hybrid retroviruses (Hu et al.,
Science, (1990) 250:1227) and homologies between the two
genomes considerably enhance the frequency of reverse
transcriptase jumps (Zhang et al., J. Virol. (1994) 68:
2409-2414). Therefore an ideal packaging cell line should
not express endogenous MLV-like (or type C retrovirus-like)
retroviral genomes which can be packaged by type C gag
proteins (Scadden et al., J. Virol. (1990) 64: 424-427;
Torrent et al., J. Mol. Biol. (1994) 240:434-444).






Packaging of human endogenous retroviral RNA was not 
detected in TELCeB and FLY packaging cells when virion 
associated RNA was analysed by RT-PCR using generic primers. 

HT1080- and TE671-derived packaging cell lines may be safer
in this respect than those generated from NIH3T3 cells, such
as GP+EAM12 cells, which are known to express and package
sequences related to type C retroviruses (Scadden et al.
supra).

To generate the FLY packaging cell lines, HT1080 cells were
transfected with gag-pol and env expression plasmids
designed to optimise viral protein expression. Direct
selection for viral gene expression was achieved in
accordance with the invention by expression of a selectable
marker gene by re-initiation of translation of the mRNA
expressing the viral proteins. This strategy resulted in
packaging cell lines capable of producing extremely high
titer viruses. Furthermore, long-term expression of
packaging functions can be maintained in these cells. Many
unnecessary viral sequences were eliminated from the
packaging constructs to reduce the risk of helper virus
generation; indeed the final packaging cells did not produce
helper virus, in that no replication competent virus (RCR)
could be detected per 10^7 vector particles.

The FLY packaging cells described herein are safer than, for
example, psiCRIP cells, at least for generation of env
recombinant retroviruses as is illustrated in Table 4
hereinafter, probably because fewer retroviral sequences
overlapping with the vector were present in the present env-
expression plasmid. Few reports have addressed the question
of the characterization of recombinant retroviruses (RVs)
(Cosset, F.L., et al., Virology (1993) 193:385-395). It is
possible that such RVs could not be detected in previous
packaging cell lines due to lower overall titers. RVs are
defective in normal cell culture conditions but are likely
to evolve to replication competent viruses if they are
allowed to replicate in cells complementing their expression,
like co-cultivated packaging cells (Bestwick et al., Proc.
Natl Acad Sci USA, (1988) 85: 5404-5408; Cosset et al.,
(1993) supra).

In preferred retroviral packaging systems according to the 
invention, RVs are eradicated for example by removal of 
viral LTRs from the packaging construct. 

Consistent with our previous studies (Takeuchi, Y., et al.,
J Virol (1994) 68:8001-8007), LacZ(RD114) and LacZ(MLV-A)
pseudotypes produced from HT1080 and TE671 cells were more
resistant to human complement than LacZ(RD114) or LacZ(MLV-A)
pseudotypes produced by 3T3 or dog cells. It was
therefore decided to use RD114 and MLV-A env genes to
generate recombinant virions with MoMLV cores.

The sequence of RD114 env gene was determined and is shown 
in Figure 4. It was found to be very close to BaEV (baboon 
endogenous virus), a type C retrovirus (Benveniste, R.E. et
al., Proc. Natl. Acad. Sci. USA (1973) 70:3316-3320; Kato,
S. et al., Japan. J. Genet. (1987) 62:127-137) with an
envelope gene displaying similarities to the external part 
of type D simian retroviruses (SRVs) . RD114 uses the SRV 
receptor on human cells (Sommerfelt & Weiss, Virology 
(1990) 176:58-69; Sommerfelt, M.A. et al., J Virol (1990) 
64:6214-6220) making the FLY packaging cells with RD114 
envelope capable of generating virions with different 
tropism. Retroviral vectors prepared so far for human gene 
therapy have used either MLV-A or GALV (gibbon ape leukemia 
virus) envelopes which display some similarities (Battini,
J.L., et al., J Virol. (1992) 66:1468-1475) and which use two
related cell surface receptors for infection (Miller, D.G.
et al., J Virol (1994) 68:8270-8276). Differences in tissue-
specific expression of MLV-A or GALV receptors have been
reported (Kavanaugh et al., Proc Natl Acad Sci USA (1994)
91:7071-7075).

The invention will now be particularly described by way of
example with reference to the accompanying drawings in
which:






Figure 1 illustrates the structure and expression of CeB.
The env gene (XbaI-ClaI) of plasmid pCRIP was removed and
was replaced by coinsertion of the two fragments XbaI-SfiI
(restriction sites underlined) from pOXEnv and a SfiI-ClaI
PCR product containing the bsr selectable marker. This
results in positioning the bsr start codon (shadowed) 74 bp
downstream to the pol stop codon (bold).




Open triangles are start codons (gag and bsr), black
triangles are stop codons (pol and bsr). The shadowed
triangle is the start codon of env, in the same reading
frame with that of bsr. SD and SA are the splice donor and
splice acceptor sites.

Figure 2 illustrates the structure and expression of 
FbdelPASAF. 

Immediately after the stop codon of env (bold) was inserted
a non retroviral KasI-NcoI (restriction sites underlined)
linker which positions the phleo start codon (shadowed) 76
bp downstream.

Open triangles are start codons (env and phleo), black
triangles are stop codons (env and phleo). SD and SA are
the splice donor and splice acceptor sites.

Figure 3 illustrates plasmids for expression of Ampho, Eco, 
RD114, Xeno, 10A1, GALV, VSV-G and FeLVB envelopes. 
All genes are expressed in the same backbone as detailed in
fig. 2. The BglII sites for ecotropic (MoMLV strain),
10A1, xenotropic (NZB.1.V6 strain) and amphotropic (4070A
strain), the NdeI site of RD114 (SC3C strain) and the BamHI site
for both FeLVB and GALV were used as 5' ends, and linked to
the MscI site immediately after the splice donor site in the
leader of FB29 LTR.

Figure 4 shows the sequence of the RD114 env gene (SEQ ID No
1).






Figure 5 shows the genetic structure of gag-pol constructs.
Initiation W and termination (T) codons are shown. The
thick dotted line below each construct shows MLV-derived
sequences. Nucleotide positions of MLV-derived sequences are
shown according to Shinnick et al. (1981) (from nt 1 to nt
6000 with deletion of the packaging signal (ΔΨ) from BalI
(nt 215) to PstI (nt 568), and with some further MoMLV
sequences in both CeB and CeB DS- from nt 7676 to nt 7938).
gag-pol and bsr genes were expressed from the same
transcription unit using either a retroviral promoter
(Mo LTR) or a non retroviral promoter (hCMV) and non
retroviral polyadenylation sequence (polyA). Splice donor
(SD) and acceptor (SA) sites are indicated. The thin line
denotes retroviral non coding sequences. The thick line
shows the rabbit beta-1 globin intron B. The position of
some restriction sites is indicated.

The nucleic acid sequences of portions of constructs (as
shown in Figure 5 (boxed areas)) are displayed for CeB (SEQ
ID No 2, Figure 6), hCMV+intron (SEQ ID No 3, Figure 7) and
hCMV+intronkaSD (SEQ ID No 4, Figure 8).



The nucleic acid sequences of portions of constructs (as
shown in Figure 3 (boxed areas)) are displayed for
FbdelPASAF (SEQ ID No 5, Figure 9), FbdelPMOSAF (SEQ ID No
6, Figure 10), FbdelPGASAF (SEQ ID No 7, Figure 11),
FbdelPRDSAF (SEQ ID No 8, Figure 12) and CMV10A1 (SEQ ID No
9, Figure 13).

The components of the viral particles are produced by two
independent expression plasmids (gag-pol or env) which also
contain selectable markers (bsr or phleo) expressed from the
same transcriptional units as gag-pol or env (figs. 1 & 2).
The selectable markers are located downstream to gag-pol or
env genes and there is an optimal distance between the stop
codon of the upstream reading frames and the start codon of
the selectable genes that should allow re-initiation of
translation (Kozak, Mol Cell Biol. (1987) 7:3438-3445).
Because there is no "Kozak" sequence (Kozak, Cell (1986) 44:
283-292) required for a normal initiation of translation for



WO 97/08330 



PCT/GB96/02061 






the marker genes, they can only be expressed by re-initiation
of translation after the upstream viral gene has been
successfully expressed. Consequently, and also because
re-initiation of translation is a poorly efficient process,
after transfection of these plasmids, cells resistant to the
drugs corresponding to those selectable genes express high
levels of the viral proteins.

To avoid viral transmission of these "helper" genomes the
constructs used suitably have the classical deletions of
both the packaging sequence located in the leader region and
of the 3' LTR, the latter being replaced by SV40
polyadenylation sequences (Figs 1 & 2).

Plasmid CeB is the MoMLV gag-pol expression unit. It
derives from pCRIP, a plasmid used to generate the
constructs introduced in the CRIP and CRE packaging cell
lines (Danos and Mulligan, 1988). As shown in fig. 1, for
generation of plasmid CeB the env gene of pCRIP has been
deleted mostly and the bsr selectable marker, encoding a
protein conferring resistance to blasticidin (Izumi et al.,
Experimental Cell Research (1991) 197, 229-233), has been
inserted downstream to the pol gene. There are exactly 74 bp
with no ATG triplets between the stop codon of pol and the
start codon of bsr; this allows its expression by re-
initiation of translation on the gag-pol mRNA, after
translation of the gag-pol reading frame.

FbdelPASAF is a plasmid expressing the amphotropic env gene
and the phleo selectable marker conferring resistance to
phleomycin (Gatignol et al., FEBS Letters (1988) 230:171-
175). By using a PCR-mediated mutagenesis strategy which
modifies the end of env gene (see fig. 2), a 76 bp linker
was inserted between the stop codon of env and the start
codon of phleo. This allows expression of phleo from the
env mRNA by re-initiation of translation. In addition,
compared to known env-expressing constructs, this strategy
of construction has reduced the length of sequences
overlapping with the ends of conventional retroviral
vectors. The env genes of Mo-MLV, FeLVB, NZB.1V6, 10A1,
GALV and RD114 are expressed by plasmids FBdelPMoSAF,
FBdelPBSAF, FBdelXSAF, FBdelpGSAF, FBdelp10A1SALF and
FBdelPRDSAF, respectively, by using the same backbone as
FBdelPASAF (fig. 3). Retroviral vectors produced with the
RD114 envelope will be useful for in vivo gene delivery
because, unlike virions with MLV ecotropic or amphotropic
envelopes, virions pseudotyped with RD114 envelopes are not
inactivated by human complement when they are produced by
Mink Mv-1-Lu cells or by some human cells (Table 1).

The HT1080 cell line, isolated from a human fibrosarcoma
(ATCC CCL121), and the TE671 cell line, isolated from a human
rhabdomyosarcoma (ATCC CRL 8805) (purchased from ATCC, and
tested for absence of usual cell culture contaminants by
ECACC), have been used for the definitive construction of
packaging cell lines. The HT1080 line was chosen among a panel
of primate and human lines because MLV-A and RD114
efficiently rescued retroviral vectors from these cells and
also because RD114 pseudotypes produced by this cell line
were stable when incubated in human serum. In a standard
assay (Takeuchi et al., J Virol (1994), 68, 8001-8007),
these latter viruses were found to be more than 500 fold more
stable than similar pseudotypes produced in 3T3 cells.

Another advantage of using non murine cells to derive
packaging lines is the absence of MLV-related endogenous
retroviral-like sequences (like VL30 in 3T3 cells) that can
cross-package with MLV-derived retroviral vectors (Torrent
et al., 1994) and generate potentially harmful recombinant
retroviruses.

The helper constructs were introduced into other cell lines
(HT1080 (table 2), Mink Mv-1-Lu (table 2), 3T3 (not shown),
TE671 (table 2)) for the purpose of comparisons of the
efficiency of the constructs.

As illustrated hereinafter (Table 2), the reverse 
transcriptase (RT) activity (provided by expression of the 
pol gene) in cells transfected with CeB is significantly
higher than that of the same cells transfected by the
parental plasmid pCRIP or that of cells chronically infected
by MLV. This enhancement of viral gene expression is
correlated with the titers of lacZ retroviral vectors when 
an envelope is provided in CeB-lacZ cells after comparison 
with titers of lacZ pseudotypes of either replication- 
competent viruses or other helper-free packaging systems. 

For the generation of final packaging cell lines, the best
clonal env transfectants have been selected. Packaging
systems obtained in this way will be able to produce helper-
free retroviral vectors at titers greater than lo« 
infectious particles per ml, which would be 10-100 fold
higher than helper-free preparations of others.

Because of the way the selectable markers are expressed (see
above), growing the packaging cells under phleomycin and
blasticidin selective pressure increases and stabilizes the
expression of the retroviral components and particularly the
envelopes, as it is possible that env glycoproteins have
toxic effects for the producer cells in the long term which
may lead to a decrease of expression.

Such an enhancement of viral production observed with the
packaging systems described herein might increase the
emergence of unwanted retroviruses having recombined between
the genomes of both the retroviral vector and either of the
two packaging-deficient constructs. However, the constructs
have been designed in such a way as to reduce the
probability of emergence of recombinant viruses compared to
the parental constructs. To check their safety, attempts
have been made to detect the presence of replication-
competent retroviruses by a mobilisation assay of a lacZ
provirus. No RC viruses have been found in any of the
retroviral vector preparations tested so far.

The following Examples illustrate the invention. 

Example 1 

Preparation of Cell lines and viruses.
The following cell lines were used: 

A204 (ATCC HTB 82), HeLa (ATCC CCL2), HT1080 (ATCC CCL121),
MRC5 (ATCC CCL171), T24 (ATCC HTB 4), VERO (ATCC CCL81) and
D17 (ATCC CCL183) were purchased from ATCC.

HOS, TE671 and Mv-1-Lu cells and their clones harboring
MFGnlslacZ retroviral vector were as described by Takeuchi et
al., J Virol (1994), 68, 8001-8007.

The above cell lines were grown in DMEM (Gibco-BRL, U.K.) 
supplemented with 10% fetal calf serum. 

EB8 (Battini et al., J. Virol (1992) 66: 1468-1475);
psiCRE, psiCRELLZ and psiCRIP (Danos et al., Proc. Natl.
Acad. Sci USA (1988) 85: 6460-6464);

GP+EAM12 cells (Markowitz et al., Virology (1988), 167, 400-
406); and

NIH-3T3 murine fibroblasts. 

These cell lines were grown in DMEM (GIBCO-BRL, U.K.) 
supplemented with 10% new-born calf serum.

Mv-1-Lu, TE671 and HT1080 cells were transfected using the
calcium-phosphate precipitation method (Sambrook et al.,
"Molecular Cloning" 1989, Cold Spring Harbour Laboratory
Press: New York) as described elsewhere (Battini et al.,
supra). CeB-transfected Mv-1-Lu, TE671 and HT1080 cells
were selected with 3, 6-8 and 4 µg/ml of blasticidin S (ICN,
UK), respectively, and blasticidin-resistant colonies were
isolated 2-3 weeks later. Cells transfected with the various
env-expression plasmids were selected with phleomycin
(CAYLA, France): 50 µg/ml (for FBASALF-transfected cells) or
10 µg/ml (for FBASAF-, FbdelPASAF-, FbdelPMOSAF-,
FBdelP10A1SAF- or FBdelPRDSAF-transfected cells). Phleomycin-
resistant colonies were isolated 2-3 weeks later.

Production of lacZ pseudotypes using replication competent
viruses, amphotropic murine leukemia virus (MLV-A) 1504
strain and cat endogenous virus RD114, was carried out as
described previously (Takeuchi et al., J Virol (1994) 68:
8001-8007).

Example 2 

Preparation of plasmids.

The env gene of pCRIP (Danos et al., supra) was excised by
HpaI/ClaI digestion. A 500 bp PCR-generated DNA fragment was
obtained using pSV2-bsr (Izumi et al., Experimental Cell
Research (1991), 197, 229-233) as template and a pair of
oligonucleotides:

(5'>CGGAATTCGGATCCGAGCTCGGCCCAGCCGGCCACCATGAAAACATTTAACATTTCTC)
(SEQ ID NO 2) at the 5' end and

(5'>GATCCATCGATAAGCTTGGTGGTAAAACTTTT) (SEQ ID NO 3) at the 3'
end, with SfiI and ClaI sites, respectively. This fragment
was inserted in the HpaI/ClaI sites of pCRIP by co-ligation with
an 85 bp HpaI/SfiI DNA fragment isolated from pOXEnv (Russell
et al., Nucleic Acids Research (1993), 21, 1081-1085) which
provides the end of the Moloney murine leukemia virus
(MoMLV) pol gene. The resulting plasmid, named CeB (Fig. 1),
could express the MoMLV gag-pol gene as well as the bsr
selectable marker conferring resistance to blasticidin S,
both driven by the MoMLV 5' LTR promoter.
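
The restriction sites said to be carried by the two bsr primers can be checked with a short script. The following is only an illustrative sketch: the recognition sequences used for SfiI (GGCCNNNNNGGCC) and ClaI (ATCGAT) are the standard ones and are not stated in this specification, and the function names are hypothetical.

```python
import re

# Standard recognition sequences (assumption: not given in the text itself).
# N stands for any base.
SITES = {
    "SfiI": "GGCCNNNNNGGCC",
    "ClaI": "ATCGAT",
}


def site_regex(site):
    """Turn a recognition sequence with N wildcards into a compiled regex."""
    return re.compile(site.replace("N", "[ACGT]"))


def find_sites(oligo, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site present."""
    hits = {}
    for name, site in sites.items():
        positions = [m.start() for m in site_regex(site).finditer(oligo)]
        if positions:
            hits[name] = positions
    return hits


# Primer sequences as printed for SEQ ID NO 2 and SEQ ID NO 3.
SEQ_ID_2 = "CGGAATTCGGATCCGAGCTCGGCCCAGCCGGCCACCATGAAAACATTTAACATTTCTC"
SEQ_ID_3 = "GATCCATCGATAAGCTTGGTGGTAAAACTTTT"

if __name__ == "__main__":
    print(find_sites(SEQ_ID_2))
    print(find_sites(SEQ_ID_3))
```

The 5' primer carries the SfiI site immediately upstream of the ATG of bsr, and the 3' primer carries the ClaI site, in agreement with the description above.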

A series of env-expression plasmids was generated using the
4070A MLV (amphotropic) env gene (Ott et al., J. Virol.
(1990), 64, 757-766) and the FB29 Friend MLV promoter
(Perryman et al., Nucleic Acid Res (1991), 19, 6950). In
FBASALF (Fig. 1) a BglII/ClaI fragment containing the env
gene was cloned in the BamHI/ClaI sites of plasmid FB3LPh, which
also contained the C57 Friend MLV LTR driving the expression
of the phleo selection marker. A 136 bp env fragment was
generated by PCR using plasmid FB3 (Heard et al., J. Virol.
(1991), 65, 4026-4032) as template and a pair of
oligonucleotides: (5'>GCTCTTCGGACCCTGCATTC) (SEQ ID NO 4) at
the 5' end (before the ClaI site) and

(5'>TAGCATGGCGCCCTATGGCTCGTACTCTATAGGC) (SEQ ID NO 5) at the 3'
end, providing a KasI restriction site immediately after the
env stop codon. This PCR fragment was digested using ClaI
and KasI. A DNA fragment containing the FB29 LTR and the
MLV-A env gene was obtained by NdeI/ClaI digestion of
FBASALF. The fragments were co-ligated in NdeI/KasI-digested
pUT626 (kindly provided by Daniel Drocourt, CAYLA labs,
France). In the resulting plasmid, named FBASAF (Fig. 1),
the phleo selectable marker was expressed from the same mRNA
as the env gene. A BglII restriction site was created after
the MscI site at position 214 in the FB29 leader by using a
commercial linker (Biolabs, France). A NdeI/BglII fragment
containing the FB29 LTR was co-inserted with the BglII/ClaI
env fragment in NdeI/ClaI-digested FBASAF plasmid DNA,
resulting in plasmid FBdelPASAF (Fig. 1). Compared to
FBASAF, FBdelPASAF has a 100 bp larger deletion in the leader
region.

Example 3



Cloning and sequencing of the RD114 env gene.

The RD114 env gene was first sub-cloned in plasmid
Bluescript KS (Stratagene) as a 3 Kb [...] insert
[...] RD114 infectious DNA clone (Reeves et
al., J. Virol (1984), 52, 164-171). A [...]
fragment of this subclone containing the RD114 env gene was
sequenced (Figure 4 (SEQ [...]); accession number:
X87829). The 5' non-coding sequence upstream of an NdeI
[...] fragments were [...]
fragment and a 63 bp PCR-generated DNA fragment using
(5'>CGCCTCATGGCCTTCATTAA) (SEQ ID NO 6) at the 5' end [...]
and (5'>[...]) (SEQ ID [...])
[...] RD114, providing a KasI restriction site just
after the RD114 env gene stop codon. The PCR fragment was
digested with NcoI and KasI. Both fragments were co-ligated
[...] BglII [...] of [...]; the
resulting plasmid was named FBdelPRDSAF (Fig. 1).

[...] (Danos et al., Proc. Natl. Acad. Sci.
USA (1988) 85: 6460-6464) was used for transfection.

Example 4 
Infection assays.

[...] cells
per well) and were incubated overnight. Infections were then
carried out at 37°C by plating 1 ml dilutions of [...]
supernatants in the presence of 4 µg/ml polybrene (Sigma) [...]
target cells. 3 h later virus-containing medium was replaced
by fresh medium and infected cells were incubated for [...]
days before X-gal staining, performed as previously
described (Tailor et al., J. Virol. [...];
Takeuchi et al., J. Virol. (1994), 68, 8001-8007). Viral
titers were determined by counting lacZ-positive colonies as
previously described (Cosset et al., J. Virol. (1990) 64:
1070-1078). Stability of lacZ pseudotypes in fresh human
serum was examined by titrating surviving virus after
incubation in a 1:1 mixture of virus harvest in serum-free
medium and fresh human serum for 1 h at 37°C as described
before (Takeuchi et al., supra).
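
As an illustration of how a titer in lacZ i.u./ml follows from a colony count, the sketch below assumes the usual end-point arithmetic (colonies multiplied by the dilution factor, divided by the volume plated). The function name and the example numbers are hypothetical; only the 1 ml inoculum is taken from the text.

```python
def titer_iu_per_ml(lacz_colonies, dilution_factor, inoculum_ml=1.0):
    """Estimate a titer (lacZ i.u./ml) from one well of a dilution series.

    lacz_colonies   -- number of lacZ-positive colonies counted in the well
    dilution_factor -- e.g. 1e5 for a 10^-5 dilution of the supernatant
    inoculum_ml     -- volume of diluted supernatant plated (1 ml in the text)
    """
    if lacz_colonies < 0 or dilution_factor <= 0 or inoculum_ml <= 0:
        raise ValueError("counts must be >= 0 and factors/volumes > 0")
    return lacz_colonies * dilution_factor / inoculum_ml


if __name__ == "__main__":
    # Hypothetical well: 50 blue colonies at a 10^-5 dilution, 1 ml plated.
    print(titer_iu_per_ml(50, 1e5))
```

In practice a well with a countable number of colonies would be chosen, and several dilutions averaged; the sketch shows a single well only.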

Example 5

Reverse transcriptase (RT) assay.

RT assays were performed either as described previously
(Takeuchi et al., supra) or using an RT assay kit (Boehringer
Mannheim, U.K.) following the manufacturer's instructions but
using MnCl2 (2 mM) instead of MgCl2.

Example 6 

Screening producer cell lines.

It has been found that viral particles generated with RD114
envelopes are more stable in human serum than virions with
MLV-A envelopes and that the producer cell line also
controls sensitivity (Takeuchi et al., supra). A panel of
cell lines was screened for their ability to produce high
titer viruses and for the sensitivity of these virions to
human serum. To do this, cells were infected at high
multiplicity with lacZ pseudotypes of either MLV-A or RD114
and cells producing helper-positive lacZ pseudotypes were
established. Human HT1080 and TE671 and mink Mv-1-Lu cells
were found to release high titer lacZ(RD114) and lacZ(MLV-A)
viruses. LacZ(MLV-A) pseudotypes produced by HT1080 cells
were more resistant to human serum than those produced by
other cells. The titer of these viruses was only four-fold
less following a 1 hr incubation with human serum than a
control incubation (Table 1). LacZ(RD114) pseudotypes
produced by human cells or mink Mv-1-Lu cells were in
general stable in human serum (Table 1). These results
suggested that HT1080, TE671 and Mv-1-Lu cells provided the
best combination of high lacZ titers and resistance to human
serum and they were therefore used for the generation of
retroviral packaging cells.

Table 1. Titer and stability of lacZ pseudotypes.

Producer     LacZ(MLV-A)                LacZ(RD114)
cell         Titer a      Stability b   Titer a      Stability b

A204               650    <3                1,200    105
HeLa                 9    nd                2,000    115
HOS              4,500    6                23,000    86
HT1080       2,000,000    26              400,000    129
MRC-5              450    10                1,000    nd
T24                350    nd                1,200    nd
TE671           15,000    2                90,000    38
VERO               260    nd                   90    nd
D17                900    <1              200,000
Mv-1-Lu         80,000                    200,000    120

a: titration on TE671 cells as lacZ i.u./ml
b: % of infectivity of human serum-treated viruses compared to
fetal calf serum-treated viruses
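
The stability figure of footnote b and the "four-fold" statement made for HT1080-produced lacZ(MLV-A) virus are related by simple arithmetic, sketched below. The function names are hypothetical, and the only value used is the HT1080 stability of 26 % read from Table 1.

```python
def serum_stability_percent(titer_human_serum, titer_control):
    """Infectivity after human serum as a percentage of the control titer."""
    if titer_control <= 0:
        raise ValueError("control titer must be positive")
    return 100.0 * titer_human_serum / titer_control


def fold_reduction(stability_percent):
    """Convert a stability percentage into a fold loss of titer."""
    if stability_percent <= 0:
        raise ValueError("stability must be positive to express as a fold loss")
    return 100.0 / stability_percent


if __name__ == "__main__":
    # HT1080 / lacZ(MLV-A): stability 26 % corresponds to roughly a
    # four-fold lower titer after incubation with human serum.
    print(round(fold_reduction(26), 2))
```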



Example 7 



Construction of an improved gag-pol expression vector.

A MoMLV gag-pol expression plasmid, CeB (Fig. 1), was
derived from pCRIP (Danos et al., Proc. Natl. Acad. Sci. USA
(1988) 85: 6460-6464). Approximately 2 Kb of env sequence
were removed from pCRIP and the bsr selectable marker,
conferring resistance to blasticidin S (Izumi et al.,
Experimental Cell Research (1991) 197: 229-233), was inserted
74 nts downstream of the gag-pol gene. This 74 nts interval
had no ATG triplets and was thought to provide an optimal
distance between the stop codon of the pol reading frame and
the start codon of the bsr gene to allow re-initiation of
translation (Kozak, Mol. Cell Biol. (1987) 7: 3438-3445).
There was no "Kozak" consensus sequence (Kozak, Cell (1986)
44: 283-292) at the 5' end of the marker gene. Therefore,
bsr could only be expressed by re-initiation of translation
after the upstream gag-pol gene had been expressed.
Consequently, after transfection of CeB in
Mv-1-Lu/MFGnlsLacZ (ML), TE671/MFGnlsLacZ (TEL) or HT1080 cells,
blasticidin S-resistant bulk populations and most cell
clones expressed high levels of gag-pol proteins as assessed by
the reverse-transcriptase (RT) activity found in cell
supernatants (Table 2). Considerably higher RT activities
were found in bulk populations of CeB-transfected ML cells
compared to bulk populations of ML cells stably transfected
with the parental pCRIP construct. Similarly, the RT
activities of two packaging cell lines generated using the
pCRIPenv- construct, psiCRE cells (Danos et al., supra) and
EB8 cells (Battini et al., supra), were less than that of
CeB-transfected clones (Table 2). Finally, RT activity in
CeB-transfected cell supernatants was higher than that of cells
chronically infected by replication-competent MLV-A (Table
2).
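
The two properties required of the interval between the pol stop codon and the bsr start codon (its length, and the absence of ATG triplets) can be checked mechanically. The sketch below is illustrative only: the construct is a made-up toy sequence, not the CeB sequence, and the function name and coordinates are hypothetical.

```python
def analyse_spacer(construct, upstream_stop_end, downstream_start):
    """Return (length, ATG positions) for the intercistronic spacer.

    construct          -- DNA sequence of the bicistronic transcript (5'->3')
    upstream_stop_end  -- index just past the stop codon of the first ORF
    downstream_start   -- index of the A of the downstream ATG
    """
    spacer = construct[upstream_stop_end:downstream_start].upper()
    atg_positions = [i for i in range(len(spacer) - 2)
                     if spacer[i:i + 3] == "ATG"]
    return len(spacer), atg_positions


# Toy sequences (hypothetical): a short upstream ORF ending in TAA, a 74 nt
# spacer without any ATG, and a short downstream ORF standing in for bsr.
UPSTREAM_ORF = "ATGGCCTAA"
SPACER = ("CTGA" * 19)[:74]
MARKER_ORF = "ATGAAAACATAA"
CONSTRUCT = UPSTREAM_ORF + SPACER + MARKER_ORF

if __name__ == "__main__":
    stop_end = len(UPSTREAM_ORF)
    start = stop_end + len(SPACER)
    print(analyse_spacer(CONSTRUCT, stop_end, start))
```

A spacer that returns an empty list of ATG positions offers no alternative start codon between the two reading frames, which is the condition described above for expression of the marker by re-initiation.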

Table 2. Secreted reverse transcriptase expression

Cell a            RT activity b    LacZ titer c

ML/MLV-A          1                8x10^4
MLSvB             0.1              <1
[...]             0.15             nd
[...]             1.7              nd
[...]             4.2              1x10^[...]
[...]             1.6              1x10^6
TEL/MLV-A         3.6              2x10^6
[...]             5.2              4x10^7
HT1080/MLV-A      1.1              1x10^6
HTCeB6            1.9              1x10^6
HTCeB18           2.7              2x10^[...]
HTCeB22 (FLY)     6.9              5x10^6
HTCeB48           5.5              3x10^6
EB8               0.22             1x10^4
psiCRE-LLZ        1.2              1x10^5 d

a: ML, Mv-1-Lu cells harboring a MFGnlslacZ provirus; TEL, TE671 cells
harboring a MFGnlslacZ provirus; /MLV-A, cells chronically infected with
MLV-A 1504 strain; MLSvB, ML cells transfected with plasmid pSV2bsr alone;
MLCRIP, ML cells co-transfected with pCRIP and pSV2bsr.
b: Average of arbitrary units relative to ML/MLV-A RT activity of [...]
clones, TELCeB clones, HTCeB clones and EB8 cells; nd, not done.
d: titration on NIH3T3 cells



To rescue infectious lacZ viruses, MLCeB and TELCeB clones
were transfected with FBASALF DNA, a plasmid designed to
express the MLV-A env gene (Fig. 1). Bulk populations of
stable FBASALF transfectants were isolated and supernatants
were titrated using TE671 cells as targets. Titers of lacZ
viruses were higher than those of either MLV-A-infected ML or TEL
cells, or FBASALF-transfected EB8 cells (Table 2). These
data suggested that CeB was an extremely efficient MLV
gag-pol expression vector in mink Mv-1-Lu and TE671 cells. CeB
was therefore used to derive packaging cells by transfection
of HT1080 cells. 41/49 blasticidin S-resistant colonies had
detectable levels of RT; 9 had RT activity higher than that
of control MLV-A-infected HT1080 cells (data not shown).
Expression of gag precursor was confirmed in cell lysates
and supernatants of these 9 HTCeB clones by immunoblotting
using antibodies against p30-CA (data not shown). The 4
clones with the highest expression of gag proteins (clones
6, 18, 22 and 48) were infected at high multiplicity with
helper-free lacZ pseudotypes bearing MLV-A envelopes
(MFGnlslacZ(A)) produced by TELCeB6/FBASALF (Table 3) and
then transfected with FBASALF. Supernatants of bulk,
phleomycin-resistant transfectants were assessed for RT
activity and lacZ titer (Table 2). Clone HTCeB22, named FLY,
was found to be the best gag-pol producer clone and was used
to introduce env expression vectors for the generation of
packaging cell lines.

Table 3. Titer following env construct transfection



Producer cell        Env source         Titer a

psiCRIP lacZ 5       pCRIPAMgag-        6x10^4 b
GP+EAM12 lacZ 25     envAM              3x10^5 b

TELCeB6              FBASALF c          5x10^7
                     FBASAF c           2x10^7
                     FBdelPASAF c       2x10^7

TELCeB6              FBdelPASAF 1       3x10^7
                     FBdelPASAF 4       2x10^7
                     FBdelPASAF 6       1x10^7
                     FBdelPASAF 7       5x10^7
                     FBdelPASAF 8       1x10^7
                     FBdelPRDSAF 2      1x10^[...]
                     FBdelPRDSAF 4      3x10^[...]
                     FBdelPRDSAF 7      1x10^7
                     FBdelPRDSAF 8      2x10^6

FLY d                FBdelPASAF 1       1x10^1
                     FBdelPASAF 4       1.5x10^6
                     FBdelPASAF 5       1x10^6
                     FBdelPASAF 7       1x10^6
                     FBdelPASAF 13      7x10^6
                     FBdelPASAF 14      4x10^[...]
                     FBdelPASAF 15      1x10^6
                     FBdelPASAF 16      5x10^6
                     FBdelPASAF 17      6x10^[...]

FLYA4 lacZ 3         FBdelPASAF 4       2x10^7 b

FLY d                FBdelPRDSAF 1      2.5x10^[...]
                     FBdelPRDSAF 2      1x10^7
                     FBdelPRDSAF 6      5x10^6
                     FBdelPRDSAF 10     2x10^[...]
                     FBdelPRDSAF 11     3x10^6
                     FBdelPRDSAF 13     1x10^6
                     FBdelPRDSAF 17     5x10^6
                     FBdelPRDSAF 18     3x10^7
                     FBdelPRDSAF 19     6x10^6

Average titers of at least three independent experiments are shown. The
standard errors did not exceed 30 % of the titer values.

a: titrated on TE671 cells as lacZ i.u./ml

b: results of best MFGnlslacZ producer clones.



WO 97/08330 



PCT/GB96/02061 



27 

c: bulk populations of env-transfectants in TELCeB6 cells.

d: titration after bulk infection with helper-free MFGnlslacZ.



Example 8 

Construction of env expression vectors.

A series of MLV-A env expression plasmids was then
generated (Fig. 1). In FBASALF, the env gene was inserted
between two Friend-MLV LTRs, its expression driven by the
FB29 MLV LTR (Perryman et al., supra). Most of the packaging
signal located in the leader region was deleted. This
plasmid also expressed the phleo selectable marker (Gatignol
et al., supra) driven by the 3' LTR. FBASAF and FBdelPASAF
were then designed following the same strategy used for CeB.
These two vectors differed only by the extent of deletion of
the packaging signal, FBdelPASAF having virtually no leader
sequence. Compared to the pCRIPAMgag- and pCRIPgag-2 env
plasmids expressed in psiCRIP or psiCRE packaging cells
(Danos et al., supra), about 5 Kb of gag-pol sequences were
removed. In addition, the 258 bp retroviral sequence
containing the end of the env gene and the beginning of U3 found
in pCRIPAMgag- and pCRIPgag-2 was also removed. For both
FBASAF and FBdelPASAF plasmids, the phleo selectable marker
was inserted downstream of the env gene by positioning a 76
nts linker with no ATG codons between the two open reading
frames. Phleo could therefore only be expressed by
re-initiation of translation by the same ribosomal unit that
had expressed the upstream env open reading frame.
FBdelPASAF was also used to generate FBdelPRDSAF, an RD114
envelope expression plasmid (Fig. 1).

After transfection of the env plasmids into TELCeB6 cells
(Table 2), bulk populations of phleomycin-resistant colonies
were isolated and their production of lacZ virus measured
(Table 3). FBASALF gave a titer of 5x10^7 lacZ i.u./ml, whilst
titers with either FBASAF or FBdelPASAF were 2x10^7 lacZ
i.u./ml (Table 3). Titers of 5x10^7 or 10^7 lacZ i.u./ml could
be obtained with some FBdelPASAF cell clones or FBdelPRDSAF
clones, respectively.

As FBdelPASAF has minimal virus-derived sequences and was
shown to be the safest construct (see below and Table 4),
it and FBdelPRDSAF were used to generate packaging lines
from FLY cells (clone HTCeB22; Table 2). [...]
of these clones was assayed by interference to challenge
[...] MFGnlslacZ(A) [...] MFGnlslacZ(RD) pseudotypes produced by
TELCeB6/FBdelPASAF-7 or TELCeB6/FBdelPRDSAF-7, respectively
(Table 3). The cell lines showing most interference were
cross-infected at high multiplicity with these pseudotypes
to provide MFGnlslacZ proviruses, and supernatants were then
titrated on TE671 cells (Table 3). FLY-FBdelPASAF-13 (FLYA13
packaging line) and FLY-FBdelPRDSAF-18 (FLYRD18 packaging
line) gave the highest productions of lacZ viruses, around
10^7 lacZ i.u./ml. The best MFGnlslacZ producer clones derived
from either psiCRIP cells (Danos et al., supra) or GP+EAM12
cells (Markowitz et al., supra) gave approximately 50-fold
lower titers (Table 3). The lacZ titers of the FLY-derived
lines shown in Table 3 are lower than those of the best
TELCeB6-derived lines after transfection of either FBdelPASAF or
FBdelPRDSAF (Table 3). However, it should be noted that the
lacZ provirus expressed in TELCeB6 cells was obtained after
clonal selection but was introduced polyclonally in
FLY-derived env-transfected cell clones. When FLY-FBdelPASAF-4
cells (FLYA4 packaging line), infected with helper-free
MFGnlslacZ(RD), were cloned by limiting dilution, the best
clones (e.g. FLYA4lacZ3) were found to produce 20 times more
infectious viruses than the bulk population, reaching the
range of titers obtained with the best TELCeB6-FBdelPASAF
clones (Table 3).




Example 9 

Assays for transfer of gag-pol or env functions.

To assay for replication-competent viruses, supernatants
were used to infect TEL cells (a clone of TE671 cells
harboring an MFGnlslacZ provirus). Infected cells were
passaged for 6 days or longer and their supernatants were
used for infection of fresh TE671 cells. No transmission of
lacZ viruses could be detected (Table 4), demonstrating that
the supernatants of pCRIPAMgag-, FBASALF-, FBASAF- or
FBdelPASAF-transfected TELCeB6 cells were helper-free.
Similar absence of replication-competent recombinant
retroviruses was demonstrated using supernatant from a
clone of psiCRIP-MFGnlslacZ cells or from two clones of
FLYA-MFGnlslacZ cells (Table 4).

There have been reports that helper-free retroviral vector
stocks may nevertheless contain recombinant retroviruses
(replication incompetent) carrying either gag-pol or env
genes (Bestwick et al., Proc. Natl. Acad. Sci. USA (1988), 85,
5404-5408; Cosset et al., Virology (1993), 193, 385-395;
Girod et al., Virology (1995), in press). To assay for such
recombinant retroviruses, mobilisation of an MFGnlslacZ
provirus from two indicator cell lines which could
cross-complement potential recombinant viruses carrying either
gag-pol or env functional genes was attempted. The TELCeB6
line (Table 2) expressing gag-pol proteins was used as the
indicator cell line to test for the presence of env
recombinant (ER) viruses. The TELMOSAF indicator line
expressing MoMLV env glycoproteins (obtained by transfection
of FBMOSAF, a plasmid expressing the MoMLV env gene using the
FBASAF backbone, in TEL cells) was used to detect the
presence of gag-pol recombinant retroviruses (GPR viruses).
After passaging for 4-8 days, the supernatants of the infected
indicator cells were used to infect either human TE671 cells
or murine NIH3T3 cells.



TELCeB6 cells transfected with various env-expressing
constructs, pCRIPAMgag-, FBASAF and FBdelPASAF, were
compared. Although the supernatants of TELCeB6-FBdelPASAF
cells were devoid of replication-competent retroviruses
(Table 4) [...]. No GPR viruses could be detected when less than
2x10^[...] virions were used to infect the indicator cells.
Similarly, TELCeB6 indicator cells infected with various
helper-free viruses were shown sporadically to release lacZ
virions (Table 4). The number depended both on the env
expression vector used and on the virus input quantity.
Compared to lacZ viruses generated using the pCRIPAMgag-
plasmid, the frequency of detection of the env-recombinant
viruses was lower for supernatants generated by using the FBASAF
and FBdelPASAF constructs (Table 4). For the FBdelPASAF
construct, when less than 5x10^[...] MFGnlslacZ(A) helper-free
virions were used to infect the indicator cells, no ER
retroviruses could be detected. From these experiments, it
could be estimated that a supernatant produced from
TELCeB6-FBdelPASAF cells, containing 1x10^7 infectious units
of MFGnlslacZ retroviral vector, contained no replication-
competent virus, and about 100 gag-pol and 100 env
recombinant retroviruses.
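
The specification does not state how the figure of about 100 recombinants per 1x10^7 infectious units was derived. One plausible reading, sketched below purely as an assumption, is an end-point argument: the recombinant frequency lies between one per lowest input that still scored positive and one per highest input that scored negative. The input values are those read from Table 4 for TELCeB6-FBdelPASAF; the function name is hypothetical.

```python
def recombinant_bounds(lowest_positive_input, highest_negative_input, stock_iu):
    """Bracket the number of recombinants expected in a stock of stock_iu.

    lowest_positive_input  -- smallest virus input (i.u.) at which a
                              recombinant was detected in at least one test
    highest_negative_input -- largest virus input (i.u.) at which no
                              recombinant was detected in any test
    Returns (lower_bound, upper_bound) as numbers of recombinants.
    """
    if not 0 < highest_negative_input < lowest_positive_input:
        raise ValueError("expected 0 < highest negative < lowest positive")
    lower = stock_iu / lowest_positive_input
    upper = stock_iu / highest_negative_input
    return lower, upper


if __name__ == "__main__":
    stock = 10**7
    # Gag-pol recombinants: detected at 2x10^5 i.u., not at 2x10^4 i.u.
    print(recombinant_bounds(2 * 10**5, 2 * 10**4, stock))
    # Env recombinants: detected at 5x10^5 i.u., not at 5x10^4 i.u.
    print(recombinant_bounds(5 * 10**5, 5 * 10**4, stock))
```

Under this reading, the stated estimate of about 100 recombinants of each type falls inside both brackets.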

Table 4. Transfer of packaging function

Producer cell            Indicator   Input virus a   Detection b
                         cell        (lacZ i.u.)     ++     +      -

Replication competent virus
psiCRIP lacZ 5           TEL         2x10^4          0/4    0/4    4/4
TELCeB6-pCRIPAMgag-      TEL         5x10^6          0/4    0/4    4/4
TELCeB6-FBASAF           TEL         5x10^6          0/4    0/4    4/4
TELCeB6-FBdelPASAF       TEL         5x10^6          0/4    0/4    4/4
FLYA4 lacZ 3             TEL         1x10^7          0/4    0/4    4/4
FLYA4 lacZ 7             TEL         1x10^7          0/4    0/4    4/4

Gag-pol recombinant
TELCeB6-FBdelPASAF 7     TELMOSAF    2x10^7          0/4    1/4    3/4
TELCeB6-FBdelPASAF 7     TELMOSAF    2x10^6          0/4    2/4    2/4
TELCeB6-FBdelPASAF 7     TELMOSAF    2x10^5          0/4    2/4    2/4
TELCeB6-FBdelPASAF 7     TELMOSAF    2x10^4          0/4    0/4    4/4

Env recombinant
TELCeB6-pCRIPAMgag-      TELCeB6     5x10^6          2/4    1/4    1/4
TELCeB6-pCRIPAMgag-      TELCeB6     5x10^5          1/4    1/4    2/4
TELCeB6-pCRIPAMgag-      TELCeB6     5x10^4          0/4    2/4    2/4
TELCeB6-FBASAF           TELCeB6     5x10^6          0/4    2/4    2/4
TELCeB6-FBASAF           TELCeB6     5x10^5          0/4    1/4    3/4
TELCeB6-FBASAF           TELCeB6     5x10^4          0/4    1/4    3/4
TELCeB6-FBdelPASAF       TELCeB6     5x10^6          0/4    1/4    3/4
TELCeB6-FBdelPASAF       TELCeB6     5x10^5          1/4    3/4    0/4
TELCeB6-FBdelPASAF       TELCeB6     5x10^4          0/4    0/4    4/4

a: number of lacZ i.u. used to infect indicator cells
b: number of incidence out of four experiments. The ranges of lacZ titers
rescued from infected indicator cells are shown for each virus input: >100
lacZ i.u./ml (++), 1-100 lacZ i.u./ml (+) and <1 lacZ i.u./ml (-).
Titers were determined on TE671 cells for replication
competent virus and env recombinant and NIH3T3 cells for
gag-pol recombinant.



Example 10 

[...] resistance to [...] absence of [...]

MFGnlslacZ(A) and (RD) harvested from FLYA13 and FLYRD18
[...] after polyclonal [...]
[...] stability in [...]
independent samples of fresh human serum [...]
of control incubations, while titers of MFGnlslacZ(A) from
FLYA13 were [...] controls (data not shown). No
replication competent virus was detected in the same assay
described above (Table 4) when 1 x 10^[...] i.u. each of
MFGnlslacZ(A) and (RD) were tested.



EXAMPLE 11. 



Generation of plasmids.

CeB [...], expressing MoMLV gag-pol [...], was
further modified to remove the splice donor site located in
the leader region. A [...] bp fragment was PCR-generated by
using OUSD (5'-TCTCGCTTCTGTTCGCGCGC) and OLSD
[...] BssHII [...] HindIII. A [...] bp
HindIII-XhoI fragment isolated from CeB (encompassing a part
of the leader sequence and the beginning of MoMLV gag) and the PCR
[...] were co-inserted into pCeB [...].
The resulting plasmid, named pCeB DS- (Fig. [...]),
[...] deletion of the splice SD [...]
restriction site created just downstream of the lost SD.

A series of gag-pol expression plasmids in which the MoMLV
LTR promoter was replaced by the human cytomegalovirus
immediate early promoter (hCMV promoter) was derived from
both CeB DS- and hCMV-G (Yee et al., 1994 PNAS, 91:
9564-9568), a plasmid used as a source for the hCMV promoter. A
NotI-filled/EcoRI 7260 bp fragment was isolated from CeB DS-
and cloned into hCMV-G which had been opened with SalI
(further rendered blunt-ended) and EcoRI to remove the VSV-G
gene. The resulting plasmid was cut with ClaI and EcoRI
to remove a 1155 bp fragment encompassing sequence derived
from the 3'-LTR and SV40 polyA sequence and self-ligated after
filling both protruding DNA ends. The resulting plasmid,
named phCMV-intron (Fig. 5), had gag-pol and bsr ORFs
inserted between the CMV promoter and rabbit beta-globin
polyA post-transcriptional regulatory sequences.

An intermediate plasmid was generated by sub-cloning a 7260
bp EcoRI fragment (isolated from CeB DS-) into hCMVG opened
with EcoRI. A 1155 bp fragment (encompassing sequence
derived from the 3'-LTR and SV40 polyA sequence) was removed
from this intermediate plasmid, which was then
re-circularized by self-ligation after filling both ends.
The resulting plasmid, named phCMV+intron 2P (Fig. 5), was
digested with NotI and the vector was treated with Klenow
enzyme. A 1440 bp fragment (encompassing the hCMV promoter and
rabbit beta-1 globin intron B (Rohrbaugh et al., 1985 Mol.
Cell Biol, 5: 147-160)) was isolated from phCMV+intron 2P by
NotI/EcoRI digestion. This fragment was further treated with
Klenow enzyme and ligated back into the vector. The
resulting plasmid, named hCMV+intron (Fig. 5), could express
gag-pol and bsr genes driven by the hCMV promoter and carried
an intron sequence derived from rabbit beta-1 globin intron
B having both SD and SA (splice acceptor) sites.

A 2450 bp fragment was removed from phCMV+intron 2P by
NotI/XhoI digestion. The resulting vector fragment was then
used to co-ligate a 1330 bp fragment (containing the hCMV
promoter + 5' end of rabbit beta-1 globin intron B (with SD
site)) isolated from phCMVG by ApaI-filled/NotI digestion
and a 1 kb fragment isolated from phCMV+intron 2P by
NotI-filled/XhoI digestion. Compared to phCMV+intron 2P, the
resulting plasmid, named hCMV+SD intron (Fig. 5), had a
deletion of the 3' end of the rabbit beta-1 globin intron B
and thus no SA site in the leader region.

[...] (Savard et al., unpublished). This plasmid, in which gag-pol
and bsr genes were driven by the hCMV promoter, had the
MoMLV SD site in the leader region.

Gag-pol expression.

The different constructs, including the parental CeB
plasmid, were analysed comparatively in a complementation
assay after transfection in TEL-FBdelPASAF cells expressing
4070A-MLV (amphotropic) envelope and harboring a MFGnlslacZ
provirus. The transient production of lacZ retroviruses as
well as the stable production of lacZ retroviral vectors
after selection with blasticidin S were determined (Table
5). All the constructs were able to rescue infectious lacZ
retroviruses, indicating the expression of gag-pol proteins
after transient transfection. Most likely due to the
efficient hCMV and rabbit beta-1 globin intron B
(post)-transcriptional regulatory sequences, hCMV+intron was
particularly potent in transient retroviral vector
production. However, 10 times fewer blasticidin-resistant
colonies were obtained with hCMV+intron compared to
CeB, and stable lacZ virus production from hCMV+intron was
about 5-10 times lower than that of CeB. Clonal examination
of lacZ retrovirus production from blasticidin-resistant
colonies indicated that 80-90% of colonies could express
high levels of gag-pol proteins for both hCMV+intron and CeB
plasmids. In contrast, despite variation in their ability to
form blasticidin-resistant colonies after transfection and
despite their ability to express gag-pol proteins from
transient transfectants, all other constructs had a weak
capacity for rescuing lacZ retroviral vectors from stable
transfectants (Table 5).



Table 5. Comparative study of gag-pol-bsr plasmids.

gag-pol-bsr        Transient         no clones   Stable             % gag-pol
plasmid            (lacZ i.u./ml)    bsr+        (lacZ i.u./ml)     /bsr

CeB                300/ml            50          10^7               90%
CeB DS-            144/ml            5           10^5               50%
hCMV+intron 2P     ND                20          10^[...]           50%
hCMV-intron        812/ml            0
hCMV+SD intron     150/ml            1000        10^2               nd
hCMV+leader        328/ml            1000        10^2-10^3          nd
hCMV+intron        12000/ml          5           10^[...]-10^7      80%


Northern blot analyses were performed on stable
transfectants (blasticidin-resistant) obtained with some of
the gag-pol-bsr plasmids. As expected, the results (not
shown) displayed a correlation between expression of gag-pol
mRNAs and gag-pol protein expression detected by rescue
analysis (Table 5). The CeB construct was found to produce 2-3
fold more gag-pol mRNAs compared to hCMV+intron.
Interestingly, an unexpected 2.45 kb RNA band was found for
the hCMV+intron construct at a ratio of 2:1 compared to the
abundance of the gag-pol mRNA band (at 5.95 kb). Further
[...] using other probes revealed a cryptic
[...] the middle of the CA coding region at position
[...] (numbering according to Shinnick et al., 1981 Nature
(London) 293: 543-548) was activated in this latter
construct. The RNA species, lacking the 3' half of the
gag gene and most of the pol gene, is unlikely to give rise
to any useful translational product. It is therefore
interesting to notice that the hCMV+intron construct was able to
[...] transcripts (gag-pol 5.95 kb mRNA
[...] 2.45 kb alternative RNA band) compared to gag-pol mRNA
expressed from the CeB construct. Therefore we decided to
[...] cryptic SD [...] hCMV+intron construct
in order to increase the ratio of gag-pol mRNAs.

Assays for transfer of gag-pol functions

[...] supernatants of packaging cells generated
with the CeB gag-pol expression construct were devoid of
replication-competent retroviruses, they were found
sporadically to transfer gag-pol genomes (example Table
[...] 1995 J. Virol [...]). Because
[...] constructs generated here by using the hCMV
promoter had much less retroviral sequences homologous to
the retroviral vector than the parental CeB construct (Fig.
[...]) [...] less [...] give rise to gag-pol recombinant
(GPR) viruses. Therefore, the most efficient gag-pol-bsr
plasmids, hCMV+intron and CeB, were further analysed for
emergence of GPR viruses. To assay for such recombinant
retroviruses, we attempted to mobilise a lacZ provirus from
an indicator cell line which could cross-complement
[...] virus [...] gag-pol functional
genes. [...] data displayed in Table 6 [...] consistently
with data reported previously (example Table 4, (Cosset et
[...])) [...] vectors generated by using the
CeB gag-pol construct were contaminated with GPR viruses. In
contrast, lacZ retrovirus vectors generated by using the
hCMV+intron construct were completely devoid of such GPR
viruses, suggesting that this construct was improved
compared to CeB with respect to the emergence of recombinant
viruses.



Table 6. Comparative study of gag-pol-bsr plasmids.

plasmid        input virus       no of experiments
               (lacZ i.u.) a     giving titres of b

CeB            5x10^6            5      3      0
               5x10^5            2      4      2
               5x10^4            0      1      7
hCMV+intron    5x10^6            0      0      8
               5x10^5            0      0      8
               5x10^4            0      0      8

4x10^4 cells of TEL/MOSAF in 24 wells were challenged with lacZ(A) of i.u.
indicated in the table (a), and incubated at 37°C for 3 days. Cells were
trypsinized and transferred into small flasks. Cell sup was harvested on day 5
after lacZ(A) challenge and plated on either TE671 (not shown) or 3T3 cells
(b). No lacZ was mobilized into TE671 at all. LacZ(A) from CMV-int 10 again
did not rescue lacZ from TEL/MOSAF.

Example 12 

Generic primers to detect D-type (Medstrand and Blomberg,
J. Virol. (1993) 67: 6778-6787), C-type (Shih et al., J.
Virol. (1989) 63: 64-75) and human endogenous virus RTVL-H
(Wilkinson et al., J. Virol. (1993) 67: 2981-2989) by RT-PCR
were employed (Patience et al., supra). Primers to detect the
mouse endogenous VL30 element (Adams et al., Mol. Cell. Biol.
(1988) 8: 2989-2998) and MFGnlslacZ RNA were designed and
synthesized (Table 7). Overnight supernatants (in 4 ml of
culture medium) from 10^6 cells of GP+EAM12lacZ25, FLYA4lacZ3
and TELCeB6FBASALF cells (Table 3) were harvested and
centrifuged in a sucrose gradient as described previously
(Patience et al., J. Virol., 70: 2654-2657). Fractions
containing retrovirus particles were collected, and RNA
extracted. One twentieth of the RNA preparation or
dilutions thereof were applied to RT-PCR as described
previously (Table 7). 1/200 of the RNA harvested from
GP+EAM12lacZ25 cells was positive for VL30 RNA. MFGnlslacZ
RNA was found from 1/20 of RNA from GP+EAM12lacZ and
TELCeB6FBASALF cells and 1/200 of RNA from FLYA4lacZ3 cells.
The primer combinations for RTVL-H, C- and D-type RNA did
not give detectable PCR product.

Table 7. RT-PCR detection of endogenous retrovirus RNA
associated with virus particles.

                                          RT-PCR of virion-associated RNA from (a)
         primer (5'-3')                   GP+EAM12    FLYA4     TELCeB6
         forward (F)/reverse (R)          lacZ25      lacZ3     FBASALF

MFGnls   F) CTCTGGCTCACAGTACGACGTAG       +           ++        +
lacZ     R) CCATCAATCCGGTAGGTTTTCCG
C-type   F) CARRGKTTCAARAACWSYCCCAC       -           -         -
         R) AGYARVGTAGCNGGGTTHAGG
D-type   F) TCCCCTTGGAATACTCCTGTTTTYGT    -           -         -
         R) CATTCCTTGTGGTAAAACTTTCCAYTG
RTVL-H   F) CCTCACCCTGATCACRYTTG          nt          -         -
         R) GAATTATGTCTGACAGAAGGG
VL30     F) GTTGACATCTGCAGAGAAAGACC       ++          nt        nt
         R) TCTGAGGTCTGTACACACAATGG

a: -, not detected; +, detected in 1/20 RNA preparation; ++, detected in 1/200 RNA
preparation; nt, not tested because the cells do not possess the corresponding
genes.
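
The C-type, D-type and RTVL-H primers in Table 7 contain IUPAC ambiguity codes (R, Y, K, W, S, V, H, N), so each line stands for a pool of oligonucleotides rather than a single sequence. The following is a minimal sketch, not part of the original protocol, of how such a primer can be expanded into a regular expression and how the pool size can be counted. The helper names `degeneracy` and `primer_regex` are illustrative assumptions; the primer strings are taken as printed in Table 7.

```python
import re
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer)

def primer_regex(primer: str) -> re.Pattern:
    """Compile a regex matching every sequence in the degenerate pool."""
    parts = []
    for base in primer:
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))

# C-type primers as printed in Table 7 (hypothetical variable names).
C_TYPE_F = "CARRGKTTCAARAACWSYCCCAC"
C_TYPE_R = "AGYARVGTAGCNGGGTTHAGG"

if __name__ == "__main__":
    print(degeneracy(C_TYPE_F), degeneracy(C_TYPE_R))
    # One concrete member of the forward pool:
    print(bool(primer_regex(C_TYPE_F).fullmatch("CAAAGGTTCAAAAACAGCCCCAC")))
```

Counting the pool size in this way gives a rough idea of how diluted each individual oligonucleotide is within a generic primer mix.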

EXAMPLE 13. 



Generation of gag-pol pre-packaging cells by using TE671
cells.

CeB, a plasmid designed to over-express MoMLV gag and pol
proteins, was introduced into TE671 human rhabdomyosarcoma
cells (ATCC CRL8805). After selection with blasticidin, 50
bsr-positive colonies were isolated and the RT (reverse
transcriptase) activity in their supernatants was analysed.
Twelve TE671-CeB (TECeB) clones with high RT activity were
selected for further analysis. The best TECeB clone, clone
#15, had an RT activity roughly equivalent to that of TELCeB6
cells (Cosset et al., J. Virol. 69:7430-7436 (1995); see
also Example 7, Table 6 in this patent application) but
contained 2-3 fold more gag precursors in cells, as
demonstrated in immunoblots using anti-CA antibodies. The
biological activity of the gag-pol proteins expressed in the
six best TECeB clones was further confirmed by their ability
to produce infectious retroviruses in a complementation
assay. A lacZ provirus was introduced into each of the TECeB
clones by polyclonal cross-infection using lacZ(RD114)
helper-free retrovirus vectors. FBMOSALF, a MoMLV env
expression plasmid (Cosset et al., J. Virol. 69:6314-6322),
was then transfected into each of the TECeB-lacZ lines and
into the TELCeB6 cell line for comparison. After selection
with phleomycin, the titer of lacZ retrovirus vectors was
determined in the supernatant of pools of
phleomycin-resistant colonies for each of the
TECeB-lacZ-FBMOSALF lines. A good correlation was found
between gag-pol expression in the TE-CeB clones (as
determined by RT assays and anti-gag immunoblots) and their
ability to release infectious lacZ particles. TE-CeB15 cells
released approximately the same number of lacZ particles as
TELCeB6 cells, although TELCeB6 cells had the advantage of
having been selected for lacZ expression (Cosset et al.,
J. Virol. 69:7430-7436 (1995)). TE-CeB15 cells were
therefore used to derive retroviral packaging cell lines.

Construction of env-expression plasmids.

A series of plasmids (Fig. 3) was designed to allow
expression of different retroviral envelope genes (isolated
from MoMLV, GALV (Gibbon Ape Leukemia Virus) and MLV-10A1).
FBdelPMOSAF (Fig. 3, nucleotide sequence in Fig. 10) and
FBdelP10A1SAF, expressing ecotropic MoMLV or MLV-10A1
envelopes respectively, were generated by replacing the
BglII/ClaI fragment of FBdelPASAF (Cosset et al., J. Virol.
69:7430-7436 (1995); see also Example 7, Fig. 2 and
nucleotide sequence in Fig. 9), which encompasses most of
the env gene and the splice acceptor site, with that of
MoMLV (positions 5407 to 7679, Shinnick et al., 1981) or
with that of MLV-10A1 (Ott et al., J. Virol. 64:757-766
(1990)).

Nucleotides 7514-7516 of GALV (Delassus et al., Virology
173:205-213 (1989)) were mutated by PCR-mediated mutagenesis
to create a ClaI site (AAG to CGA), thereby introducing a
conservative modification (a lysine (amino acid 665 of the
GALV env precursor) to an arginine). The BamHI/ClaI fragment
(nts 4994 (Delassus et al., Virology 173:205-213 (1989)) to
7517) was then sub-cloned into FBdelPASAF from which the
BglII/ClaI fragment encompassing most of the env gene and
the splice acceptor site had been removed. The resulting
plasmid, expressing GALV envelope glycoproteins, was named
FBdelPGASAF (Fig. 3, nucleotide sequence in Fig. 11).
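
The codon change described above does two things at once: it converts a lysine codon (AAG) into an arginine codon (CGA), and it completes a ClaI recognition site (ATCGAT). The sketch below illustrates that logic only. The flanking bases are hypothetical placeholders chosen so that the site can form; they are not taken from the GALV sequence, and the helper name `replace_codon` is an assumption.

```python
# Only the two codons named in the text are needed for this illustration.
CODON_TO_AA = {"AAG": "Lys", "CGA": "Arg"}
CLAI_SITE = "ATCGAT"  # ClaI recognition sequence

def replace_codon(seq: str, start: int, new_codon: str) -> str:
    """Return seq with the three bases at `start` replaced by new_codon."""
    return seq[:start] + new_codon + seq[start + 3:]

# Hypothetical context: 'GGAT' and 'TCC' are made-up flanks, NOT GALV sequence.
wild_type = "GGAT" + "AAG" + "TCC"
mutant = replace_codon(wild_type, 4, "CGA")

if __name__ == "__main__":
    print(CODON_TO_AA[wild_type[4:7]], "->", CODON_TO_AA[mutant[4:7]])
    print("ClaI site before:", CLAI_SITE in wild_type)
    print("ClaI site after: ", CLAI_SITE in mutant)
```

In this toy context the site is absent before the mutation and present afterwards, while the encoded residue changes from lysine to arginine, a conservative substitution.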

CMV10A1 was generated by inserting a Klenow enzyme-filled
EagI/SalI fragment from FBdelP10A1SAF (encompassing the 10A1
MLV env gene and the phleo selectable marker) into hCMV-G
digested with BamHI and filled with Klenow enzyme. The
resulting plasmid, CMV10A1 (Fig. 3 and nucleotide sequence
in Fig. 13), could express 10A1 envelopes under control of
the hCMV promoter and the phleo selectable marker by
translation reinitiation.

Generation of a multi-tropic set of TE671-based retroviral 
packaging lines. 

FBdelPRDSAF (Fig. 3, nucleotide sequence in Fig. 12),
FBdelPASAF, FBdelPGASAF, FBdelPMOSAF and FBdelP10A1SAF were
independently introduced into cells of the TE-CeB15
pre-packaging line, expressing MoMLV gag-pol proteins.
Transfected cells were phleomycin-selected and 15-20
phleo-resistant colonies were isolated for each
env-expression plasmid transfected.

Individual colonies were then analysed for expression of
envelope glycoproteins by immunoblots on cell lysates using
antibodies against RD114 SU glycoproteins, against Rauscher
leukemia virus SU (to screen MoMLV, MLV-4070A and MLV-10A1
env-producer clones) or against GALV. The best env-producer
colonies as determined in this assay were further analysed
by a complementation assay after introducing a lacZ
retroviral vector. LacZ pseudotypes released from the
different packaging cell lines were titrated by using NIH
3T3 cells or TE671 cells as targets. Titers higher than
1x10^7 lacZ i.u./ml were obtained for the best clones.
Depending on the envelope specificities expressed in these
cells, the new TE671-based retroviral packaging cell lines
were named TE-FLYE, TE-FLYA, TE-FLYRD, TE-FLY10A1 and
TE-FLYGA; they express the MoMLV, MLV-4070A, RD114, MLV-10A1
and GALV env genes, respectively.

Assays for detecting replication-competent retroviruses
(RCRs) were performed on the supernatants of these cells and
were negative (less than 1/ml).

TE671 cells are very potent for transient expression, with
more than 95% of cells expressing the transgene three days
after plasmid transfection (Hatziioannou and Cosset,
unpublished data, (1996)). The ability of retroviral
packaging cell lines to transiently produce retroviral
vectors is of crucial importance for gene therapy when
vectors carrying toxic genes have to be prepared. Transient
expression of retroviral vectors was compared between cells
of the TE-FLYA line and the BING line (Pear et al., Proc
Natl Acad Sci USA 90, 8392-6 (1993)), a retroviral packaging
cell line designed to transiently express retroviral
vectors. Results (Table 8) showed that TE-FLYA cells were
more efficient for transient expression of a lacZ retroviral
vector, hence resulting in higher titers.






Table 8. Comparative study of transient production of lacZ
vectors.

Cells were transfected with the MFGnlslacZ retroviral vector [...] by the calcium
phosphate precipitation method, and titers of lacZ vectors released in the cell
supernatant were determined as lacZ i.u./ml at day 3 following transfection. The
relative number of cells (a) (average per microscope field) and the % of
transfected cells (b) determined after X-gal staining are shown.

Retroviral vectors prepared from TE671-based packaging cell
lines were analysed for their sensitivity to human
complement-mediated inactivation. Experiments were conducted
as previously described (Cosset et al., J. Virol.
69:7430-7436 (1995); see also Example 10 in this patent
application) by using three human sera from individual
donors (Table 9). As expected, MLV-A prepared from mouse 3T3
cells was highly sensitive to inactivation after 1 hr
incubation with sera. In contrast, titers of lacZ vectors
produced from TE-FLYRD cells were 17 to 55% of control
incubations, while titers of lacZ vectors from TE-FLYA cells
were 1 to 30% of controls.



Table 9. Human serum sensitivity of viruses produced from
TE671-based packaging cell lines.

Virus from:    hu56 (a)       hu57 (a)       BTS (a)

3T3/A          <0.2, <0.2     <0.2, <0.2     <0.2, <0.2
TE-FLYE        15, 7.8        16, 11         48, 60
TE-FLYA        1, 0.6         2.2, 7.1       28, 19
TE-FLYRD       17, 22         30, 44         54, 63

Three fresh human serum samples were tested in duplicate: hu56 (A+), hu57 (AB+),
BTS (AB+). (a) % of control (average for FCS and Opti-MEM treatment) is shown.
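
Each cell of Table 9 holds a pair of duplicate measurements, so comparing cell lines means averaging each pair first. The sketch below shows one way to do that, using the values as printed in the table. The names `TABLE_9` and `duplicate_mean` are illustrative, and treating a "<0.2" entry as its upper bound of 0.2 is an assumption made only for this example.

```python
from statistics import mean

# Values transcribed from Table 9 (% of control, duplicate measurements).
TABLE_9 = {
    "3T3/A":    {"hu56": ["<0.2", "<0.2"], "hu57": ["<0.2", "<0.2"], "BTS": ["<0.2", "<0.2"]},
    "TE-FLYE":  {"hu56": [15, 7.8], "hu57": [16, 11],   "BTS": [48, 60]},
    "TE-FLYA":  {"hu56": [1, 0.6],  "hu57": [2.2, 7.1], "BTS": [28, 19]},
    "TE-FLYRD": {"hu56": [17, 22],  "hu57": [30, 44],   "BTS": [54, 63]},
}

def duplicate_mean(values):
    """Return (mean, censored); censored is True if any value was given as '<x'."""
    numbers, censored = [], False
    for value in values:
        if isinstance(value, str) and value.startswith("<"):
            numbers.append(float(value[1:]))  # assumption: use the upper bound
            censored = True
        else:
            numbers.append(float(value))
    return mean(numbers), censored

if __name__ == "__main__":
    for virus, sera in TABLE_9.items():
        for serum, values in sera.items():
            avg, censored = duplicate_mean(values)
            prefix = "<" if censored else ""
            print(f"{virus:9s} {serum:5s} {prefix}{avg:.1f}")
```

A mean flagged as censored should be read as an upper limit, not as a measured value.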
CLAIMS:



1. A recombinant expression vector comprising a gene of 
interest and a selectable marker gene, wherein the 
selectable marker gene is arranged downstream of the 
gene of interest and a stop codon associated with the 
gene of interest is spaced from a start codon of said 
selectable marker gene at a distance which is 
sufficient to ensure that said selectable marker 
protein is expressed from the corresponding mRNA as a 
result of translation reinitiation. 

2. A recombinant expression vector according to claim 1 
wherein the vector is a viral vector. 



3. A recombinant expression vector according to claim 2
wherein the vector is a retroviral vector.

4. A recombinant expression vector according to any one
of claims 1 to 3 wherein the gene of interest is
included as part of a viral packaging construct.

5. A recombinant expression vector according to any one
of the preceding claims wherein the number of
nucleotides in the space between the stop codon of the
gene of interest and the start codon of the selectable
marker is in the range of from 20 to 200 nucleotides.

6. A recombinant expression vector according to claim 5
wherein the number of nucleotides in the space between
the stop codon of the gene of interest and the start
codon of the selectable marker is in the range of from
60 to 80 nucleotides.

7. A process for producing a cell line in which a gene of
interest is expressed, which process comprises:

transforming host cells with an expression vector
according to any one of the claims 1 to 6; and

selecting those cells where expression of the
selection marker gene may be detected.

8. A process according to claim 7 wherein the host cell 
is a eukaryotic cell. 

9. A host cell transformed with a recombinant expression
vector according to any one of the claims 1 to 6.

10. A retroviral packaging cell line comprising a host
cell transformed with a first and a second recombinant
expression vector, said first recombinant expression
vector having a packaging-deficient construct
comprising a viral gag-pol gene and a first selectable
marker gene downstream thereof, and said second
recombinant expression vector having a
packaging-deficient construct comprising a viral env
gene and a second selectable marker gene downstream
thereof; wherein the start codon of the first and second
selectable markers are spaced from the stop codons of
the viral gag-pol gene and the viral env gene
respectively by a distance which ensures that said
selectable marker protein is expressed from the
corresponding mRNA as a result of translation
reinitiation.

11. A retroviral packaging cell line according to claim 10 
wherein the first selectable marker is a bsr 
selectable marker and the second selectable marker is 
a phleo selectable marker. 

12. A retroviral packaging cell line according to any one
of claims 10 or 11 wherein the packaging-deficient
construct comprising the viral gag-pol gene and first
selectable marker is the CeB (SEQ ID No 2) expression
construct.

13. A retroviral packaging cell line according to any one
of claims 10 or 11 wherein the packaging-deficient
construct comprising the viral env gene and second
selectable marker is the FBdelPASAF (SEQ ID No 5),
the FBdelPMOSAF (SEQ ID No 6), the FBdelPGASAF (SEQ ID
No 7), the FBdelPRDSAF (SEQ ID No 8), the FBdelPXSAF
(Fig. 3), the FBdelP10A1SAF (Fig. 3), or the
FBdelPVSVGSAF (Fig. 3) expression construct.

14. A retroviral packaging cell line according to any one
of claims 10 or 11 wherein the recombinant expression
vector is a packaging-deficient retroviral helper
construct.

15. A retroviral packaging cell line according to claim 14
wherein the overlapping sequences between the genomes
of the retroviral vector and the packaging-deficient
construct are reduced by minimizing the extent of
non-coding retroviral sequences in the packaging-deficient
genome.

16. A retroviral packaging cell line according to any one 
of claims 10 to 15 wherein the viral gag-pol gene and 
the selectable marker are expressed under the control 
of a non-retroviral promoter. 

17. A retroviral packaging cell line according to claim 16
wherein the promoter is fused to rabbit beta-1 globin
intron.

18. A retroviral packaging cell line according to claim 16
or claim 17 wherein the promoter is a hCMV promoter.

19. A retroviral packaging cell line according to any one
of claims 16 to 18 wherein the viral gag-pol
gene and the selectable marker is a hCMV+intron (SEQ
ID No 3) or a hCMV+intronkaSD (SEQ ID No 4) expression
construct.

20. A retroviral packaging cell line according to any one
of claims 10 to 15 wherein the viral env gene and the
selectable marker are under the control of a
non-retroviral promoter.

21. A retroviral packaging cell line according to claim 20
wherein the promoter is fused to rabbit beta-1 globin
intron.

22. A retroviral packaging cell line according to claim 20 
or claim 21 wherein the promoter is a hCMV promoter. 

23. A retroviral packaging cell line according to any one of
claims 20 to 22 wherein the viral env gene and the
selectable marker is a CMV10A1 (SEQ ID No 9)
expression construct.

24. A retroviral packaging cell line according to any one
of claims 10 to 23 wherein the cell line is the HT1080
line, the TE671 line, the 3T3 line, the 293 line or
the Mv-1-Lu line.

25. A retroviral packaging cell line according to any one
of claims 10 to 24 wherein the retroviral packaging
cells comprise human HT1080 cells and express RD114
envelopes.

26. A retroviral packaging cell line according to any one
of claims 10 to 24 wherein the retroviral packaging
cells comprise human TE671 cells and express RD114
envelopes.
27. A process for producing a retroviral packaging cell
line in which a gene of interest is expressed, which
process comprises:

transforming host cells with a first and a second
recombinant expression vector, said first recombinant
expression vector having a packaging-deficient
construct comprising a viral gag-pol gene and a first
selectable marker gene downstream thereof, and said
second recombinant expression vector having a
packaging-deficient construct comprising a viral env
gene and a second selectable marker gene downstream
thereof; wherein the start codon of the first and
second selectable markers are spaced from the stop
codons of the viral gag-pol gene and the viral env
gene respectively by a distance which ensures that
said selectable marker protein is expressed from the
corresponding mRNA as a result of translation
reinitiation; and

selecting transformed cells which express said
first and/or second marker genes.

28. A packaging deficient construct for use in a process
according to claim 27, which expresses a viral gag-pol
gene and a selectable marker wherein a start codon of
the selectable marker is spaced from a stop codon of
the viral gag-pol gene by a distance which ensures
that said selectable marker protein is expressed from
the corresponding mRNA as a result of translation
reinitiation.

29. A packaging deficient construct for use in a process
according to claim 27, which expresses a viral env
gene and a selectable marker gene; wherein a start
codon of the selectable marker is spaced from a stop
codon of the viral env gene by a distance which
ensures that said selectable marker protein is
expressed from the corresponding mRNA as a result of
translation reinitiation.
[Schematic of the CeB expression vector. Labels: del psi, Mo MLV GAG-POL, SA,
BSR, poly A. Junction between the pol and bsr genes:]

pol gene... TCT AGA CTG ACA TGG CGC
GTT CAA CGC TCT CAA AAC CCC TTA
AAA ATA AGG TTA ACC CGC GAG GCC
CCC TAA

tccccttaattcttctcatgctcagaggggtcagtac
tgcttcgcccgcctccagtgcgacccagccagccacc
ATG AAA ACA TTT AAC ATT TCT... bsr gene

Figure 1. Schematic structure of CeB expression vector
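
The claims bound the spacing between the stop codon of the gene of interest and the start codon of the selectable marker at 20 to 200 nucleotides, with 60 to 80 nucleotides as the narrower range. The sketch below shows how that spacing could be measured on a candidate construct. It is an illustration under stated assumptions, not the applicants' procedure: the toy sequence is invented, the function names are hypothetical, and the marker start is taken to be the first ATG downstream of the stop codon, which only holds when the spacer itself contains no ATG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def orf_end(seq: str, orf_start: int) -> int:
    """Index of the first base after the in-frame stop codon of the ORF at orf_start."""
    for i in range(orf_start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i + 3
    raise ValueError("no in-frame stop codon found")

def spacer_length(seq: str, orf_start: int) -> int:
    """Nucleotides between the upstream stop codon and the next ATG."""
    end = orf_end(seq, orf_start)
    marker_start = seq.find("ATG", end)  # assumption: no ATG inside the spacer
    if marker_start == -1:
        raise ValueError("no downstream start codon found")
    return marker_start - end

def within(length: int, low: int, high: int) -> bool:
    """True if the spacer length lies in the inclusive range [low, high]."""
    return low <= length <= high

# Invented toy construct: short upstream ORF, 70-nt spacer, marker start.
upstream = "ATG" + "GCT" * 3 + "TAA"
spacer = "C" * 70
marker = "ATGAAAACATTT"
construct = upstream + spacer + marker

if __name__ == "__main__":
    n = spacer_length(construct, 0)
    print(n, within(n, 20, 200), within(n, 60, 80))
```

For a real construct the marker start codon position would normally be supplied from the annotation, since a spacer may contain an out-of-context ATG.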
[Schematic of the FBdelPASAF expression vector. Labels: del psi, FB29 LTR, ENV,
SA, PHLEO, poly A. Junction between the env and phleo genes:]

env gene... AC GAG CCA TAG
ggcgcc tagtgttgacaattaatcatcggcatagtata
cggcatagtataatacgactcactataggagggccacc
ATG GCC AAG TTG ACC... phleo gene

Figure 2. Schematic structure of FBdelPASAF expression vector
[Schematic diagrams of the env expression vectors. Labels recoverable from the
figure:]

FBdelPASAF:     FB29 LTR - 4070A MLV env - phleo - polyA; ClaI, KasI, MscI/BglII
FBdelPMOSAF:    FB29 LTR - Mo MLV env - phleo - polyA; KasI, MscI/BglII
FBdelPRDSAF:    FB29 LTR - RD114 env - phleo - polyA; KasI, MscI/NdeI
FBdelPBSAF:     FB29 LTR - [...] env - phleo - polyA; KasI, MscI/BamHI
FBdelPXSAF:     FB29 LTR - NZB MLV-X env - phleo - polyA; ClaI, KasI, MscI/BglII
FBdelP10A1SAF:  FB29 LTR - 10A1 MLV env - phleo - polyA; ClaI, KasI, MscI/BglII
FBdelPGASAF:    FB29 LTR - GALV env - phleo - polyA; ClaI, KasI, MscI/BamHI
FBdelPVSVGSAF:  FB29 LTR - VSV-G - phleo - polyA; KasI, MscI/BglII
CMV10A1SAF:     hCMV - 10A1 MLV env - phleo - polyA; ClaI, KasI, BamHI/EagI

Figure 3. Schematic structure of env expression vectors
NGAGCTCAGGACAGGTAGAAAGAATGAATAGAACAATAAAAGAGACCCTTACTAAATTGA 60
CCTTAGAGACTGGCTTAAAAGATTGGAGACGCCTCCTATCTCTGGCTTTGTTAAGAGCCA 120
GAAATACGCCCAACCGTTTTCGGCTCACCCCATATGAAATCCTTTATGGGGGACCCCCCC 180
CTTTGTCAACCTTGCTCAATTCCTTCTCCCCCTCCGATCCTAAGACTGATTTACAAGCCC 240
GACTAAAAGGGCTGCAAGGCGTGCAGGCCCAAATCTGGACACCCCTGGCCGAATTGTACC 300
GGCCAGGACATCCACAAACTAGCCACCCATTTCAGGTGGGAGACTCCGTGTACGTCCGGC 360
GGCACCGCTCTCAAGGATTGGAGCCTCGTTGGAAGGGACCTTACATCGTCCTGCTGACCA 420
CGCCCACCGCCATAAAGGTTGACGGGATCGCCGCCTGGATTCACGCATCGCACGCCAAGG 480
CAGCCCCAAAAACCCCTGGACCAGAAACTCCCAAAACCTGGAAGCTCCGCCGTTCGGAGA 540
ACCCTCTTAAGATAAGACTCTCCCGTGTCTGACTGCTAATCCACCTTGTCCCTGTACTAA 600
CCCAAAATGAAACTCCCAACAGGAATGGTCATTTTATGTAGCCTAATAATAGTTCGGGCA 660
GGGTTTGACGACCCCCGCAAGGCTATCGCATTAGTACAAAAACAACATGGTAAACCATGC 720
GAATGCAGCGGAGGGCAGGTATCCGAGGCCCCACCGAACTCCATCCAACAGGTAACTTGC 780
CCAGGCAAGACGGCCTACTTAATGACCAACCAAAAATGGAAATGCAGAGTCACTCCAAAA 840
ATCTCACCTAGCGGGGGAGAACTCCAGAACTGCCCCTGTAACACTTTCCAGGACTCGATG 900
CACAGTTCTTGTTATACTGAATACCGGCAATGCAGGCGAATTAATAAGACATACTACACG 960
GCCACCTTGCTTAAAATACGGTCTGGGAGCCTCAACGAGGTACAGATATTACAAAACCCC 1020
AATCAGCTCCTACAGTCCCCTTGTAGGGGCTCTATAAATCAGCCCGTTTGCTGGAGTGGC 1080
ACAGCCCCCATCCATATCTCCGATGGTGGAGGACCCCTCGATACTAAGAGAGTGTGGACA 1140
GTCCAAAAAAGGCTAGAACAAATTCATAAGGCTATGACTCCTGAACTTCAATACCACCCC 1200
TTAGCCCTGCCCAAAGTCAGAGATGACCTTAGCCTTGATGCACGGACTTTTGATATCCTG 1260
AATACCACTTTTAGGTTACTCCAGATGTCCAATTTTAGCCTTGCCCAAGATTGTTGGCTC 1320
TGTTTAAAACTAGGTACCCCTACCCCTCTTGCGATACCCACTCCCTCTTTAACCTACTCC 1380
CTAGCAGACTCCCTAGCGAATGCCTCCTGTCAGATTATACCTCCCCTCTTGGTTCAACCG 1440
ATGCAGTTCTCCAACTCGTCCTGTTTATCTTCCCCTTTCATTAACGATACGGAACAAATA 1500
GACTTAGGTGCAGTCACCTTTACTAACTGCACCTCTGTAGCCAATGTCAGTAGTCCTTTA 1560
TGTGCCCTAAACGGGTCAGTCTTCCTCTGTGGAAATAACATGGCATACACCTATTTACCC 1620
CAAAACTGGACCAGACTTTGCGTCCAAGCCTCCCTCCTCCCCGACATTGACATCAACCCG 1680
GGGGATGAGCCAGTCCCCATTCCTGCCATTGATCATTATATACATAGACCTAAACGAGCT 1740
GTACAGTTCATCCCTTTACTAGCTGGACTGGGAATCACCGCAGCATTCACCACCGGAGCT 1800
ACAGGCCTAGGTGTCTCCGTCACCCAGTATACAAAATTATCCCATCAGTTAATATCTGAT 1860
GTCCAAGTCTTATCCGGTACCATACAAGATTTACAAGACCAGGTAGACTCGTTAGCTGAA 1920
GTAGTTCTCCAAAATAGGAGGGGACTGGACCTACTAACGGCAGAACAAGGAGGAATTTGT 1980
TTAGCCTTACAAGAAAAATGCTGTTTTTATGCTAACAAGTCAGGAATTGTGAGAAACAAA 2040
ATAAGAACCCTACAAGAAGAATTACAAAAACGCAGGGAAAGCCTGGCAACCAACCCTCTC 2100
TGGACCGGGCTGCAGGGCTTTCTTCCGTACCTCCTACCTCTCCTGGGACCCCTACTCACC 2160
CTCCTACTCATACTAACCATTGGGCCATGCGTTTTCAGTCGCCTCATGGCCTTCATTAAT 2220
GATAGACTTAATGTTGTACATGCCATGGTGCTGGCCCAGCAATACCAAGCACTCAAAGCT 2280
GAGGAAGAAGCTCAGGATTGAGCTTCCGGGACAAAAGCAGGGGGGAATGAGAAGTCAGAA 2340
CCCCCCACCTTTGCTACATAAATAACCGCTTTCATTTCGCTTCTGTAAAACGCTTATGCG 2400
CCCCACCCTAGCCGGAAAGTCCCCAGCCGCTACGCAACCCGGGCCCCGAGTTGCATCAGC 2460
CGTTCGCAACCCGGGCTCCGAGTTGCATCAGCCGAAAGAAACTTCATTTCCCAAGCTT 2518



Fig. 4 
AATGAAAGAC CCCACCTGTA GGTTTGGCAA GCTAGCTTAA GTAACGCCAT TTTGCAAGGC 60

ATGGAAAAAT ACATAACTGA GAATAGAGAA GTTCAGATCA AGGTCAGGAA CAGATGGAAC 120 

AGCTGAATAT GGGCCAAACA GGATATCTGT GGTAAGCAGT TCCTGCCCCG GCTCAGGGCC 180 

AAGAACAGAT GGAACAGCTG AATATGGGCC AAACAGGATA TCTGTGGTAA GCAGTTCCTG 240 

CCCCGGCTCA GGGCCAAGAA CAGATGGTCC CCAGATGCGG TCCAGCCCTC AGCAGTTTCT 300 

AGAGAACCAT CAGATGTTTC CAGGGTGCCC CAAGGACCTG AAATGACCCT GTGCCTTATT 360

TGAACTAACC AATCAGTTCG CTTCTCGCTT CTGTTCGCGC GCTTCTGCTC CCCGAGCTCA 420 

ATAAAAGAGC CCACAACCCC TCACTCGGGG CGCCAGTCCT CCGATTGACT GAGTCGCCCG 480 

GGTACCCGTG TATCCAATAA ACCCTCTTGC AGTTGCATCC GACTTGTGGT CTCGCTGTTC 540 

CTTGGGAGGG TCTCCTCTGA GTGATTGACT ACCCGTCAGC GGGGGTCTTT CATTTGGGGG 600 

CTCGTCCGGG ATCGGGAGAC CCCTGCCCAG GGACCACCGA CCCACCACCG GGAGGTAAGC 660 

TGGAAGCTTC TGCAGCATCG TTCTGTGTTG TCTCTGTCTG ACTGTGTTTC TGTATTTGTC 720 

TGAGAATATG GGCCAGACTG TTACCACTCC CTTAAGTTTG ACCTTAGGTC ACTGGAAAGA 780 

TGTCGAGCGG ATCGCTCACA ACCAGTCGGT AGATGTCAAG AAGAGACGTT GGGTTACCTT 840 

CTGCTCTGCA GAATGGCCAA CCTTTAACGT CGGATGGCCG CGAGACGGCA CCTTTAACCG 900 

AGACCTCATC ACCCAGGTTA AGATCAAGGT CTTTTCACCT GGCCCGCATG GACACCCAGA 960 

CCAGGTCCCC TACATCGTGA CCTGGGAAGC CTTGGCTTTT GACCCCCCTC CCTGGGTCAA 1020 

GCCCTTTGTA CACCCTAAGC CTCCGCCTCC TCTTCCTCCA TCCGCCCCGT CTCTCCCCCT 1080 

TGAACCTCCT CGTTCGACCC CGCCTCGATC CTCCCTTTAT CCAGCCCTCA CTCCTTCTCT 1140 

AGGCGCCAAA CCTAAACCTC AAGTTCTTTC TGACAGTGGG GGGCCGCTCA TCGACCTACT 1200 

TACAGAAGAC CCCCCGCCTT ATAGGGACCC AAGACCACCC CCTTCCGACA GGGACGGAAA 1260 

TGGTGGAGAA GCGACCCCTG CGGGAGAGGC ACCGGACCCC TCCCCAATGG CATCTCGCCT 1320 

ACGTGGGAGA CGGGAGCCCC CTGTGGCCGA CTCCACTACC TCGCAGGCAT TCCCCCTCCG 1380 

CGCAGGAGGA AACGGACAGC TTCAATACTG GCCGTTCTCC TCTTCTGACC TTTACAACTG 1440 

GAAAAATAAT AACCCTTCTT TTTCTGAAGA TCCAGGTAAA CTGACAGCTC TGATCGAGTC 1500 

TGTTCTCATC ACCCATCAGC CCACCTGGGA CGACTGTCAG CAGCTGTTGG GGACTCTGCT 1560 

GACCGGAGAA GAAAAACAAC GGGTGCTCTT AGAGGCTAGA AAGGCGGTGC GGGGCGATGA 1620

TGGGCGCCCC ACTCAACTGC CCAATGAAGT CGATGCCGCT TTTCCCCTCG AGCGCCCAGA 1680 

CTGGGATTAC ACCACCCAGG CAGGTAGGAA CCACCTAGTC CACTATCGCC AGTTGCTCCT 1740 

AGCGGGTCTC CAAAACGCGG GCAGAAGCCC CACCAATTTG GCCAAGGTAA AAGGAATAAC 1800 

ACAAGGGCCC AATGAGTCTC CCTCGGCCTT CCTAGAGAGA CTTAAGGAAG CCTATCGCAG 1860 

GTACACTCCT TATGACCCTG AGGACCCAGG GCAAGAAACT AATGTGTCTA TGTCTTTCAT 1920 

TTGGCAGTCT GCCCCAGACA TTGGGAGAAA GTTAGAGAGG TTAGAAGATT TAAAAAACAA 1980 

GACGCTTGGA GATTTGGTTA GAGAGGCAGA AAAGATCTTT AATAAACGAG AAACCCCGGA 2040 

AGAAAGAGAG GAACGTATCA GGAGAGAAAC AGAGGAAAAA GAAGAACGCC GTAGGACAGA 2100 

GGATGAGCAG AAAGAGAAAG AAAGAGATCG TAGGAGACAT AGAGAGATGA GCAAGCTATT 2160 

GGCCACTGTC GTTAGTGGAC AGAAACAGGA TAGACAGGGA GGAGAACGAA GGAGGTCCCA 2220 

ACTCGATCGC GACCAGTGTG CCTACTGCAA AGAAAAGGGG CACTGGGCTA AAGATTGTCC 2280 

CAAGAAACCA CGAGGACCTC GGGGACCAAG ACCCCAGACC TCCCTCCTGA CCCTAGATGA 2340 

CTAGGGAGGT CAGGGTCAGG AGCCCCCCCC TGAACCCAGG ATAACCCTCA AAGTCGGGGG 2400 

GCAACCCGTC ACCTTCCTGG TAGATACTGG GGCCCAACAC TCCGTGCTGA CCCAAAATCC 2460 

TGGACCCCTA AGTGATAAGT CTGCCTGGGT CCAAGGGGCT ACTGGAGGAA AGCGGTATCG 2520 

CTGGACCACG GATCGCAAAG TACATCTAGC TACCGGTAAG GTCACCCACT CTTTCCTCCA 2580 

TGTACCAGAC TGTCCCTATC CTCTGTTAGG AAGAGATTTG CTGACTAAAC TAAAAGCCCA 2640

AATCCACTTT GAGGGATCAG GAGCTCAGGT TATGGGACCA ATGGGGCAGC CCCTGCAAGT 2700 

GTTGACCCTA AATATAGAAG ATGAGCATCG GCTACATGAG ACCTCAAAAG AGCCAGATGT 2760 

TTCTCTAGGG TCCACATGGC TGTCTGATTT TCCTCAGGCC TGGGCGGAAA CCGGGGGCAT 2820 

GGGACTGGCA GTTCGCCAAG CTCCTCTGAT CATACCTCTG AAAGCAACCT CTACCCCCGT 2880 

GTCCATAAAA CAATACCCCA TGTCACAAGA AGCCAGACTG GGGATCAAGC CCCACATACA 2940 

GAGACTGTTG GACCAGGGAA TACTGGTACC CTGCCAGTCC CCCTGGAACA CGCCCCTGCT 3000

ACCCGTTAAG AAACCAGGGA CTAATGATTA TAGGCCTGTC CAGGATCTGA GAGAAGTCAA 3060 

CAAGCGGGTG GAAGACATCC ACCCCACCGT GCCCAACCCT TACAACCTCT TGAGCGGGCT 3120 

CCCACCGTCC CACCAGTGGT ACACTGTGCT TGATTTAAAG GATGCCTTTT TCTGCCTGAG 3180 

ACTCCACCCC ACCAGTCAGC CTCTCTTCGC CTTTGAGTGG AGAGATCCAG AGATGGGAAT 3240

CTCAGGACAA TTGACCTGGA CCAGACTCCC ACAGGGTTTC AAAAACAGTC CCACCCTGTT 3300 

TGATGAGGCA CTGCACAGAG ACCTAGCAGA CTTCCGGATC CAGCACCCAG ACTTGATCCT 3360

GCTACAGTAC GTGGATGACT TACTGCTGGC CGCCACTTCT GAGCTAGACT GCCAACAAGG 3420 

TACTCGGGCC CTGTTACAAA CCCTAGGGAA CCTCGGGTAT CGGGCCTCGG CCAAGAAAGC 3480 

CCAAATTTGC CAGAAACAGG TCAAGTATCT GGGGTATCTT CTAAAAGAGG GTCAGAGATG 3540

GCTGACTGAG GCCAGAAAAG AGACTGTGAT GGGGCAGCCT ACTCCGAAGA CCCCTCGACA 3600

ACTAAGGGAG TTCCTAGGGA CGGCAGGCTT CTGTCGCCTC TGGATCCCTG GGTTTGCAGA 3660 

AATGGCAGCC CCCTTGTACC CTCTCACCAA AACGGGGACT CTGTTTAATT GGGGCCCAGA 3720 

CCAACAAAAG GCCTATCAAG AAATCAAGCA AGCTCTTCTA ACTGCCCCAG CCCTGGGGTT 3780 

GCCAGATTTG ACTAAGCCCT TTGAACTCTT TGTCGACGAG AAGCAGGGCT ACGCCAAAGG 3840

TGTCCTAACG CAAAAACTGG GACCTTGGCG TCGGCCGGTG GCCTACCTGT CCAAAAAGCT 3900

AGACCCAGTA GCAGCTGGGT GGCCCCCTTG CCTACGGATG GTAGCAGCCA TTGCCGTACT 3960 

GACAAAGGAT GCAGGCAAGC TAACCATGGG ACAGCCACTA GTCATTCTGG CCCCCCATGC 4020 

AGTAGAGGCA CTAGTCAAAC AACCCCCCGA CCGCTGGCTT TCCAACGCCC GGATGACTCA 4080 
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CTATCAGGCC 

CCCGGCTACG 

GGCCGAAGCC 

CACCTGGTAC 

GGTGACCACC 

GCGGGCTGAA 

TGTTTATACT 

AAGGCGTGGG 

CCTACTAAAA 

AAAGGGACAC 

AGCCATCACA 

CTCAGAACAT 

TTATGATAAA 

TACTTTTGAA 

GGCTCTCCTA 

AAATATCACT 

ACAGGGAACT 

GATAAAGCCC 

CTGGATAGAA 

AGAGGAGATC 

CTTCGTCTCC 

TTGTGCATAC 

GACTTTAACT 

AGCCCTGTAC 

ATATGGGGCA 

CAGCCCCTCT 

ACCTCTGGCG 

AGTCGGCGAC 

AGGACCTTAC 

TTGGATACAC 

GACATGGCGC 

CTAATCCCCT 

GCGGCCCAGC 

GAAGTAGCGA 

GCAATTCGTA 

CGAGTAACTG 

GATTTTGACA 

CGAGTGGTAA 

TTTGTGTTAA 

CTCAAATATA 

AAAGACAGGA 

CCTATAGAGT 

GGAATGAAAG 

GCATGGAAAA 

ACAGTCGAGA 

AATTTCACAA 

AATGTATCTT 

CTCCTCTACT 

GTTAATTAGG 

AAGGGTAATT 

AAACAGCCCA 

AACACCCTGC 

CACATTTTCC 

AGGAACCCAG 

GTTCATCTGC 

GGTCCTGTAG 

ATGAAAATTT 

TGAATGCAAG 

CCACATCAAA 



TTGCTTTTGG 

CTGCTCCCAC 

CACGGAACCC 

ACGGATGGAA 

GAGACCGAGG 

CTGATAGCAC 

GATAGCCGTT 

TTGCTCACAT 

GCCCTCTTTC 

AGCGCCGAGG 

GAGACTCCAG 

TTTCATTACA 

ACAAAGAAGT 

TTATTAGACT 

GAGAGAAGCC 

GAGACCTGCA 

AGGGTCCGCG 

GGATTGTATG 

GCCTTCCCAA 

TTCCCCAGGT 

AAGGTGAGTC 

AGACCCCAAA 

AAATTAACGC 

CGAGCCCGCA 

CCCCCGCCCC 

CTCCAAGCTC 

GCAGCCTACC 

ACAGTGTGGG 

ACAGTCCTGC 

GCCGCCCACG 

GTTCAACGCT 

TAATTCTTCT 

CGGCCACCAT 

CAGAGAAGAT 

CGAAAACAGG 

TTTGTGCAGA 

CGATTGTAGC 

GTCCTTGTGG 

TAGAAATGAA 

CCCGAAATTA 

TATCAGTGGT 

ACGAGCCATA 

ACCCCACCTG 

ATACATAACT 

ACTTGTTTAT 

ATAAAGCATT 

ATCATGTCTG 

TGAGAGGACA 

TCACTTAACA 

TTAAAATATC 

CAAATGTCAA 

TCATCAAGAA 

CCACCTGTGT 

CACTCCACTG 

TGACTGTCAA 

TTTGCTAACA 

GACCCTTGAA 

TTTAACATAG 

ATATTTCCAC 



ACACGGACCG 

TGCCTGAGGA 

GACCCGACCT 

GCAGTCTCTT 

TAATCTGGGC 

TCACCCAGGC 

ATGCTTTTGC 

CAGAAGGCAA 

TGCCCAAAAG 

CTAGAGGCAA 

ACACCTCTAC 

CAGTGACTGA 

ATTGGGTCTA 

TTCTTCATCA 

ACAGTCCCTA 

AAGCTTGTGC 

GGCATCGGCC 

GCTATAAATA 

CCAAGAAAGA 

TCGGCATGCC 

AGACAGTGGC 

GCTCAGGCCA 

TTGCAACTGG 

ACACGCCGGG 

TTGTAAACTT 

ACTTACAGGC 

AAGAACAACT 

TCCGCCGACA 

TGACCACCCC 

TGAAGGCTGC 

CTCAAAACCC 

GATGCTCAGA 

GAAAACATTT 

TACAATGCTT 

AGAAATCATT 

AGCCATTGCG 

TGTTAGACAC 

TATGTGTAGG 

TGGCAAGTTA 

AAAGTTTTAC 

CCAGGCTCTA 

GATAAAATAA 

TAGGTTTGGC 

GAGAATAGAG 

TGCAGCTTAT 

TTTTTCACTG 

GATCCCCAGG 

TTCCAATCAT 

AAAAGGAAAT 

TGGGAAGTCC 

CAGCAGAAAC 

GCACTGTGGT 

AGGTTCCAAA 

GATAAGCATT 

CTGTAGCATT 

CACCCTGCAG 

TGGGTTTTCC 

CAGTTACCCC 

AGGTTAAGTC 



GGTCCAGTTC 

AGGGCTGCAA 

AACGGACCAG 

ACAAGAGGGA 

TAAAGCCCTG 

CCTAAAGATG 

TACTGCCCAT 

AGAGATCAAA 

ACTTAGCATA 

CCGGATGGCT 

CCTCCTCATA 

TATAAAGGAC 

CCAAGGAAAA 

GCTGACTCAC 

CTACATGCTG 

ACAAGTCAAC 

CGGCACTCAT 

TCTTCTAGTT 

AACCGCCAAG 

TCAGGTATTG 

CGATCTGTTG 

GGTAGAAAGA 

CTCTAGAGAC 

CCCCCATGGC 

CCCTGACCCT 

TCTCTACTTA 

GGACCGACCG 

CCAGACTAAG 

CACCGCCCTC 

CGACCCCGGG 

CTTAAAAATA 

GGGGTCAGTA 

AACATTTCTC 

TATGAGGATA 

TCGGCAGTAC 

ATTGGTAGTG 

CCTTATTCTG 

GAGTTGATTT 

GTCAAAACTA 

CACCAAGCTT 

GTTTTGACTC 

AAGATTTTAT 

AAGCTAGCTT 

AAGTTCAGAT 

AATGGTTACA 
CATTCTAGTT 

AAGCTCCTCT 
AGGCTGCCCA 
TGGGTAGGGG 
CTTCCACTGC 
ATACAAGCTG 
TGCTGTGTTA 
ATATCTAGTG 
ATCCTTATCC 
TTTTGGGGTT 
CTCCAAAGGT 
AGCACCATTT 
AATAACCTCA 
CTCATTTAAA 



GGACCGGTGG 

GACAACTGCC 

CCGCTCCCAG 

CAGCGTAAGG 

CCAGCCGGGA 

GCAGAAGGTA 

ATCCATGGAG 

AATAAAGACG 

ATCCATTGTC 

GACCAAGCGG 

GAAAATTCAT 

CTAACCAAGT 

CCTGTGATGC 

CTCAGCTTCT 

AACCGGGATC 

GCCAGCAAGT 

TGGGAGATCG 

TTTATAGATA 

GTCGTAACCA 

GGAACTGACA 

GGGATTGATT 

ATGAATAGAA 

TGGGTGCTCC 

CTCACCCCAT 

GACATGACAA 

GTCCAGCACG 

GTGGTACCTC 

AACCTAGAAC 

AAAGTAGACG 

GGTGGACCAT 

AGGTTAACCC 

CTGCTTCGCC 

AACAAGATCT 

ATAAACATCA 

ATATTGAAGC 

CAGTTTCGAA 

ACGAAGTAGA 

CAGACTATGC 

CGATTGAAGA 

ATCGATTAGT 

AACAATATCA 

TTAGTCTCCA 

AAGTAACGCC 

CAAGGTCAGG 

AATAAAGCAA 

GTGGTTTGTC 

GTGTCCTCAT 

TCCACCCTCT 

TTTTTCACAG 

TGTGTTCCAG 

TCAGCTTTGC 

GTAATGTGCA 

TTTTCATTTT 

AAAACAGCCT 

ACAGTTTGAG 

TCCCCACCAA 

TCATGAGTTT 

GTTTTAACAG 

TTAGGCAAAG 



TAGCCCTGAA 4140 
TTGATATCCT " 4200 

ACGCCGACCA 4260 

CGGGAGCTGC 4320 

CATCCGCTCA 4380 

AGAAGCTAAA 4440 

AAATATAC AG 4500 

AGATCTTGGC 4560 

CAGGACATCA 4620 

CCCGAAAGGC 4680 

CACCCTACAC 4740 

TGGGGGCCAT 4800 

CTGACCAGTT 4860 

CAAAAATGAA 4920 

GAACACTCAA 4980 

CTGCCGTTAA 5040 

ATTTCACCGA 5100 

CCTTTTCTGG 5160 

AGAAGCTACT 5220 

ATGGGCCTGC 5280 

GGAAATTACA 5340 

CCATCAAGGA 5400 

TACTCCCCTT 5460 

ATGAGATCTT 5520 

GAGTTACTAA 5580 

AAGTCTGGAG 5640 

ACCCTTACCG 5700 

CTCGCTGGAA 5760 

GCATCGCAGC 5820 

CCTCTAGACT 5 880 

GCGAGGCCCC 5940 

CGGCTCCAGT 6000 

AGAATTAGTA 6060 

TGTGGGAGCG 6120 

GTATATAGGA 6180 

TGGACAAAAG 6240 

TAGAAGTATT 6300 

ACCAGATTGT 6360 

ACTCATTCCA 6420 

CCAATTTGTT 6480 

CCAGCTGAAG 6540 

GAAAAAGGGG 6600 

ATTTTGCAAG 6660 

AACAGATGGA 6720 

TAGCATCACA 6780 

CAAACTCATC 6840 

AAACCCTAAC 6900 

GTGTCCTCCT 6960 

ACCGCTTTCT 7020 

AAGTGTTGGT 7080 

ACAAGGGCCC 7140 

AAACAGGAGG 7200 

TACTTGGATC 7260

TGTGGTCAGT 7320 

CAGGATATTT 7380

CAGCAAAAAA 7440 

TTTGTGTCCC 7500 

TAACAGCTTC 7560

GAATTC 7616 
Figure 7. hCMV+intron Sequence 1



AGATCTCCCG ATCCCCTATG GTCGACTCTC AGTACAATCT GCTCTGATGC CGCATAGTTA 60

AGCCAGTATC TGCTCCCTGC TTGTGTGTTG GAGGTCGCTG AGTAGTGCGC GAGCAAAATT 120 

TAAGCTACAA CAAGGCAAGG CTTGACCGAC AATTGCATGA AGAATCTGCT TAGGGTTAGG 180 

CGTTTTGCGC TGCTTCGCGA TGTACGGGCC AGATATACGC GTTGACATTG ATTATTGACT 240 

AGTTATTAAT AGTAATCAAT TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC 300

GTTACATAAC TTACGGTAAA TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG 360

ACGTCAATAA TGACGTATGT TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA 420 

TGGGTGGACT ATTTACGGTA AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA 480 

AGTACGCCCC CTATTGACGT CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC 540 

ATGACCTTAT GGGACTTTCC TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC 600 

ATGGTGATGC GGTTTTGGCA GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA 660 

TTTCCAAGTC TCCACCCCAT TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG 720 

GACTTTCCAA AATGTCGTAA CAACTCCGCC CCATTGACGC AAATGGGCGG TAGGCGTGTA 780 

CGGTGGGAGG TCTATATAAG CAGAGCTCTC TGGCTAACTA GAGAACCCAC TGCTTAACTG 840 

GCTTATCGAA ATGTCGACTG AGAACTTCAG GGTGAGTTTG GGGACCCTTG ATTGTTCTTT 900 

CTTTTTCGCT ATTGTAAAAT TCATGTTATA TGGAGGGGGC AAAGTTTTCA GGGTGTTGTT 960 

TAGAATGGGA AGATGTCCCT TGTATCACCA TGGACCCTCA TGATAATTTT GTTTCTTTCA 1020 

CTTTCTACTC TGTTGACAAC CATTGTCTCC TCTTATTTTC TTTTCATTTT CTGTAACTTT 1080 

TTCGTTAAAC TTTAGCTTGC ATTTGTAACG AATTTTTAAA TTCACTTTTG TTTATTTGTC 1140 

AGATTGTAAG TACTTTCTCT AATCACTTTT TTTTCAAGGC AATCAGGGTA TATTATATTG 1200 

TACTTCAGCA CAGTTTTAGA GAACAATTGT TATAATTAAA TGATAAGGTA GAATATTTCT 1260 

GCATATAAAT TCTGGCTGGC GTGGAAATAT TCTTATTGGT AGAAACAACT ACATCCTGGT 1320 

CATCATCCTG CCTTTCTCTT TATGGTTACA ATGATATACA CTGTTTGAGA TGAGGATAAA 1380 

ATACTCTGAG TCCAAACCGG GCCCCTCTGC TAACCATGTT CATGCCTTCT TCTTTTTCCT 1440 

ACAGCTCCTG GGCAACGTGC TGGTTGTTGT GCTGTCTCAT CATTTTGGCA AGAATTGGCC 1500 

GCAAGCTTCT GCAGCATCGT TCTGTGTTGT CTCTGTCTGA CTGTGTTTCT GTATTTGTCT 1560 

GAGAATATGG GCCAGACTGT TACCACTCCC TTAAGTTTGA CCTTAGGTCA CTGGAAAGAT 1620 

GTCGAGCGGA TCGCTCACAA CCAGTCGGTA GATGTCAAGA AGAGACGTTG GGTTACCTTC 1680

TGCTCTGCAG AATGGCCAAC CTTTAACGTC GGATGGCCGC GAGACGGCAC CTTTAACCGA 1740 

GACCTCATCA CCCAGGTTAA GATCAAGGTC TTTTCACCTG GCCCGCATGG ACACCCAGAC 1800 

CAGGTCCCCT ACATCGTGAC CTGGGAAGCC TTGGCTTTTG ACCCCCCTCC CTGGGTCAAG 1860 

CCCTTTGTAC ACCCTAAGCC TCCGCCTCCT CTTCCTCCAT CCGCCCCGTC TCTCCCCCTT 1920 

GAACCTCCTC GTTCGACCCC GCCTCGATCC TCCCTTTATC CAGCCCTCAC TCCTTCTCTA 1980 

GGCGCCAAAC CTAAACCTCA AGTTCTTTCT GACAGTGGGG GGCCGCTCAT CGACCTACTT 2040 

ACAGAAGACC CCCCGCCTTA TAGGGACCCA AGACCACCCC CTTCCGACAG GGACGGAAAT 2100 

GGTGGAGAAG CGACCCCTGC GGGAGAGGCA CCGGACCCCT CCCCAATGGC ATCTCGCCTA 2160 

CGTGGGAGAC GGGAGCCCCC TGTGGCCGAC TCCACTACCT CGCAGGCATT CCCCCTCCGC 2220 

GCAGGAGGAA ACGGACAGCT TCAATACTGG CCGTTCTCCT CTTCTGACCT TTACAACTGG 2280 

AAAAATAATA ACCCTTCTTT TTCTGAAGAT CCAGGTAAAC TGACAGCTCT GATCGAGTCT 2340 

GTTCTCATCA CCCATCAGCC CACCTGGGAC GACTGTCAGC AGCTGTTGGG GACTCTGCTG 2400 

ACCGGAGAAG AAAAACAACG GGTGCTCTTA GAGGCTAGAA AGGCGGTGCG GGGCGATGAT 2460 

GGGCGCCCCA CTCAACTGCC CAATGAAGTC GATGCCGCTT TTCCCCTCGA GCGCCCAGAC 2520 

TGGGATTACA CCACCCAGGC AGGTAGGAAC CACCTAGTCC ACTATCGCCA GTTGCTCCTA 2580 

GCGGGTCTCC AAAACGCGGG CAGAAGCCCC ACCAATTTGG CCAAGGTAAA AGGAATAACA 2640 

CAAGGGCCCA ATGAGTCTCC CTCGGCCTTC CTAGAGAGAC TTAAGGAAGC CTATCGCAGG 2700 

TACACTCCTT ATGACCCTGA GGACCCAGGG CAAGAAACTA ATGTGTCTAT GTCTTTCATT 2760 

TGGCAGTCTG CCCCAGACAT TGGGAGAAAG TTAGAGAGGT TAGAAGATTT AAAAAACAAG 2820 

ACGCTTGGAG ATTTGGTTAG AGAGGCAGAA AAGATCTTTA ATAAACGAGA AACCCCGGAA 2880 

GAAAGAGAGG AACGTATCAG GAGAGAAACA GAGGAAAAAG AAGAACGCCG TAGGACAGAG 2940 

GATGAGCAGA AAGAGAAAGA AAGAGATCGT AGGAGACATA GAGAGATGAG CAAGCTATTG 3000 

GCCACTGTCG TTAGTGGACA GAAACAGGAT AGACAGGGAG GAGAACGAAG GAGGTCCCAA 3060 

CTCGATCGCG ACCAGTGTGC CTACTGCAAA GAAAAGGGGC ACTGGGCTAA AGATTGTCCC 3120 

AAGAAACCAC GAGGACCTCG GGGACCAAGA CCCCAGACCT CCCTCCTGAC CCTAGATGAC 3180 

TAGGGAGGTC AGGGTCAGGA GCCCCCCCCT GAACCCAGGA TAACCCTCAA AGTCGGGGGG 3240 

CAACCCGTCA CCTTCCTGGT AGATACTGGG GCCCAACACT CCGTGCTGAC CCAAAATCCT 3300

GGACCCCTAA GTGATAAGTC TGCCTGGGTC CAAGGGGCTA CTGGAGGAAA GCGGTATCGC 3360

TGGACCACGG ATCGCAAAGT ACATCTAGCT ACCGGTAAGG TCACCCACTC TTTCCTCCAT 3420 

GTACCAGACT GTCCCTATCC TCTGTTAGGA AGAGATTTGC TGACTAAACT AAAAGCCCAA 3480 

ATCCACTTTG AGGGATCAGG AGCTCAGGTT ATGGGACCAA TGGGGCAGCC CCTGCAAGTG 3540

TTGACCCTAA ATATAGAAGA TGAGCATCGG CTACATGAGA CCTCAAAAGA GCCAGATGTT 3600

TCTCTAGGGT CCACATGGCT GTCTGATTTT CCTCAGGCCT GGGCGGAAAC CGGGGGCATG 3660 

GGACTGGCAG TTCGCCAAGC TCCTCTGATC ATACCTCTGA AAGCAACCTC TACCCCCGTG 3720 

TCCATAAAAC AATACCCCAT GTCACAAGAA GCCAGACTGG GGATCAAGCC CCACATACAG 3780 

AGACTGTTGG ACCAGGGAAT ACTGGTACCC TGCCAGTCCC CCTGGAACAC GCCCCTGCTA 3840 

CCCGTTAAGA AACCAGGGAC TAATGATTAT AGGCCTGTCC AGGATCTGAG AGAAGTCAAC 3900

AAGCGGGTGG AAGACATCCA CCCCACCGTG CCCAACCCTT ACAACCTCTT GAGCGGGCTC 3960 

CCACCGTCCC ACCAGTGGTA CACTGTGCTT GATTTAAAGG ATGCCTTTTT CTGCCTGAGA 4020 

CTCCACCCCA CCAGTCAGCC TCTCTTCGCC TTTGAGTGGA GAGATCCAGA GATGGGAATC 4080 
Z^^^AT TGACCTGGAC CAGACTCCCA CAGGGTTTCA AAAACAGTCC CACCCTGTTT
GATGAGGCAC TGCACAGAGA CCTAGCAGAC TTCCGGATCC AGCACCCAGA c^TGATCPTr 
7£nnrJnn G TGGATGA CTT ACTGCTGGCC GCCACTTCTG AGCTAGACTG CcllclSZ 
cSaSS CCTAGGGAAC CTCGGGTATC GGGCCTCGG? CaSS 

^^IIZ?? AGAAA CAGGT CAAGTATCTG GGGTATCTTC TAAAAGAGGG TCAGAGATCT 
CTGACTGAGG CCAGAAAAGA GACTGTGATG GGGCAGCCTA CTCCGAAGAC CCCTCGArS 
CTAAGGGAGT TCCTAGGGAC GGCAGGCTTC TGTCGCCTCT GGATCCCTGG SSaSa" 
ATGGCAGCCC CCTTGTACCC TCTCACCAAA ACGGGGACTC TGTTTAATTG gSSSJ 
CAACAAAAGG CCTATCAAGA AATCAAGCAA GCTCTTCTAA ctgccccagc 
SS^T™ CTAAGGG CTT TGAACTCTTT GTCGACGAGA AGcSgS S555? 

^^; AACGC aaaaa ctggg accttggcgt cggccggtgg cctacctgtc cSSSgcS 
> A ^? A oI^ cagctgggtg gcccccttgc ctacggatgg tagcagcca? tgJJSJSg* 
A ™™S caggcaagg t aaccatggga cagccactag tcattctggc cccccatgS 
^ctcaaaca accccccgac cgctggcttt ccaacgcccg gatcactSc 

^^S 7 TGCTTTTGGA CACGGACCGG GTCCAGTTCG GACCGGTGGT AGCCCTGA^C 
2** CTACGC TGG TCCCACT GCCTGAGGAA GGGCTGCAAC ACAACTGCCT TGATATCCTG 
GCCGAAGCCC ACGGAACCCG ACCCGACCTA ACGGACCAGC CGCTCCC^GA SxSacS? 
JSSSE£ CGGATGGAAG CAGTCTCTTA CAAGAGGGAC gISgS 
^5 A ^CG AGACCGAGGT AATCTGGGCT AAAGCCCTGC CAGCCGGGAC AtSgCToS 

tgataggag t cacccaggcc ctaaagatgg cagaagg^aa gaagc™? 

ATAG CCGTTA TGCTTTTCCT ACTGCCCATA TCCATGGAGA AAtSS 
AGGG ^ GGGT TGCTGAGA TC AGAAGGCAAA GAGATCAAAA ATAAAGACGA GATCTTGGCC 
^S^^ 0 CCG TCTTTCT GCCCAAAAGA CTTAGCATAA TCCATTGTCC AGGACAtSa 
AA ? GGA = ACA CCGCCGAGGC TAGAGGCAAC CGGATGGCTG ACCAAGCGGC CCGAAAGGCA 
GCCATCACAG AGACTCCAGA CACCTCTACC CTCCTCATAG AAAATTCATC ACCCTACA^C 
TATr^^T ^ TCATTACAG AGTGACTGAT ATAAAGGACC TAACCAAGTT SgSSSS 
T A ^ AAA ^ CAAAGAAGTA TTGGGTCTAC CAAGGAAAAC CTGTGATGCC TGACCAGTTT 

gc^tc™ JSSSS* tcttcatcag ctgactcacc tcagcptctc aSStcIIg 
GC r:£ cc I AG AGAGAAGG CA cagtccctac tacatgctga accgggatcg aacactcaaa 

^rZriirl* AG £ CCTGCAA AGG TTGTGCA CAAGTCAACG CCAGCAAGTC TGCCgSaII 

n™^ 0 ™ GGGT( =cgcgg gcatcggccc ggcactcatt gggagatcga tttcaccgag 
A ^ GCCCG ga ttgtatgg ctataaatat cttctagttt ttatagatac Stoggc 

r G ^I AG ^2 CCTTCCCAAC CAAGAAAGAA ACCGCCAAGG TCGTAACCAA GAaSSS 
GAGGAGATCT TCCCCAGGTT CGGCATGCCT CAGGTATTGG GAACTGACAA TGGGCCTrrr 
TTCGTCTCCA AGGTGAGTCA GACAGTGGCC GATCTGTTGG GGATTGATTG SKSSSS 
T G I5£t™ GACCCCAAAG CTCAGGCCAG GTAGAAAGAA TGAATAGAAC CATCaISSg 
A ^^ CTA AATTAACGCT TGCAACTGGC TCTAGAGACT GGGTGCTCCT ACTCCCCTTA 
GCGC J G T AGC GA GCCCGCAA CACGCCGGGC CCCCATGGCC TCACCCCATA TGAGATCT^A 63 60 

lnl^^.t C GGGGG CCCCT TGTAAACTTC CCTGACCCTG ACATGACAAG AOTTXcSac* 642^ 

Sc5gg?gg SSSSS c *? acaggct ctctacttag TCCAGCACGA ^80 

^^Tr? 030 CAGG CTACCA AGAACAACTG GACCGACCGG TGGTACCTCA CCCTTACCGA 
rJ^^ ™ CAGTGTGGG T CCGCCGACAC CAGACTAAGA ACCTAGAACC TCGCTgSaa" 
^^ ACA CAGTCC TGCT GACCACCCCC ACCGCCCTCA AAGTAGACGG CATCGCAGCT 

^ g 5 aca cg ccgcccacgt gaaggctgcc gaccccgggg gtggaccatc ctctaSctS 

tcaaaacccc ttaaaaataa ggttaacccg cSgSwS 
^^SS 077 AATTCTTCTG atgctcagag gggtcagtac tgcttcgccc ggctccagtg 

S^AgSgAC 22S5 J***™™ ACATTTCTCA ACAAGATCTA 2S5S5E 6900 

^f;^ 0 ^ AGAGAAGATT ACAATGCTTT ATGAGGATAA TAAACATCAT GTGGGAGCGG 6960 

GAA ^TCGTAC GAAAACAGGA GAAATCATTT CGGCAGTACA TATTGAAGCG TATATAGGAC 7020 

SSiSSE 2K?S££ S^™^ TTGGTAGTGC AGTTTCGAAT GgSS£ tSsS 

A ;Tii GACA S GATTGTAGCT GTTAGACACC CTTATTCTGA CGAAGTAGAT AGAAGTATTC 7140 

SSSSSK ATGTGTAGGG AGTTGATTTC AGACTATGCA OciSSiSS ?|SS 

IXt^Zl^ T AGAAATGAAT GGCAAGTTAG TCAAAACTAC GATTGAAGAA CTCATTCCAC 7?<50 

TCAAATATAC CCGAAATTAA AAGTTTTACC ACCAAGCTTA TCGAATTC tTCA;lTCCAC ^50 



Figure 8. hCMV+intronkaSD Sequence 1

AGATCTCCCG ATCCCCTATG GTCGACTCTC AGTACAATCT GCTCTGATGC CGCATAGTTA 60

AGCCAGTATC TGCTCCCTGC TTGTGTGTTG GAGGTCGCTG AGTAGTGCGC GAGCAAAATT 120 

TAAGCTACAA CAAGGCAAGG CTTGACCGAC AATTGCATGA AGAATCTGCT TAGGGTTAGG 180 

CGTTTTGCGC TGCTTCGCGA TGTACGGGCC AGATATACGC GTTGACATTG ATTATTGACT 240 

AGTTATTAAT AGTAATCAAT TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC 300 

GTTACATAAC TTACGGTAAA TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG 360

ACGTCAATAA TGACGTATGT TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA 420 

TGGGTGGACT ATTTACGGTA AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA 480 

AGTACGCCCC CTATTGACGT CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC 540 

ATGACCTTAT GGGACTTTCC TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC 600 

ATGGTGATGC GGTTTTGGCA GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA 660 

TTTCCAAGTC TCCACCCCAT TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG 720 

GACTTTCCAA AATGTCGTAA CAACTCCGCC CCATTGACGC AAATGGGCGG TAGGCGTGTA 780 

CGGTGGGAGG TCTATATAAG CAGAGCTCTC TGGCTAACTA GAGAACCCAC TGCTTAACTG 840 

GCTTATCGAA ATGTCGACTG AGAACTTCAG GGTGAGTTTG GGGACCCTTG ATTGTTCTTT 900 

CTTTTTCGCT ATTGTAAAAT TCATGTTATA TGGAGGGGGC AAAGTTTTCA GGGTGTTGTT 960 

TAGAATGGGA AGATGTCCCT TGTATCACCA TGGACCCTCA TGATAATTTT GTTTCTTTCA 1020 

CTTTCTACTC TGTTGACAAC CATTGTCTCC TCTTATTTTC TTTTCATTTT CTGTAACTTT 1080 

TTCGTTAAAC TTTAGCTTGC ATTTGTAACG AATTTTTAAA TTCACTTTTG TTTATTTGTC 1140 

AGATTGTAAG TACTTTCTCT AATCACTTTT TTTTCAAGGC AATCAGGGTA TATTATATTG 1200 

TACTTCAGCA CAGTTTTAGA GAACAATTGT TATAATTAAA TGATAAGGTA GAATATTTCT 1260

GCATATAAAT TCTGGCTGGC GTGGAAATAT TCTTATTGGT AGAAACAACT ACATCCTGGT 1320

CATCATCCTG CCTTTCTCTT TATGGTTACA ATGATATACA CTGTTTGAGA TGAGGATAAA 1380

ATACTCTGAG TCCAAACCGG GCCCCTCTGC TAACCATGTT CATGCCTTCT TCTTTTTCCT 1440 

ACAGCTCCTG GGCAACGTGC TGGTTGTTGT GCTGTCTCAT CATTTTGGCA AGAATTGGCC 1500 

GCAAGCTTCT GCAGCATCGT TCTGTGTTGT CTCTGTCTGA CTGTGTTTCT GTATTTGTCT 1560 

GAGAATATGG GCCAGACTGT TACCACTCCC TTAAGTTTGA CCTTAGGTCA CTGGAAAGAT 1620 

GTCGAGCGGA TCGCTCACAA CCAGTCGGTA GATGTCAAGA AGAGACGTTG GGTTACCTTC 1680 

TGCTCTGCAG AATGGCCAAC CTTTAACGTC GGATGGCCGC GAGACGGCAC CTTTAACCGA 1740 

GACCTCATCA CCCAGGTTAA GATCAAGGTC TTTTCACCTG GCCCGCATGG ACACCCAGAC 1800 

CAGGTCCCCT ACATCGTGAC CTGGGAAGCC TTGGCTTTTG ACCCCCCTCC CTGGGTCAAG 1860

CCCTTTGTAC ACCCTAAGCC TCCGCCTCCT CTTCCTCCAT CCGCCCCGTC TCTCCCCCTT 1920 

GAACCTCCTC GTTCGACCCC GCCTCGATCC TCCCTTTATC CAGCCCTCAC TCCTTCTCTA 1980 

GGCGCCAAAC CTAAACCTCA AGTTCTTTCT GACAGTGGGG GGCCGCTCAT CGACCTACTT 2040 

ACAGAAGACC CCCCGCCTTA TAGGGACCCA AGACCACCCC CTTCCGACAG GGACGGAAAT 2100 

GGTGGAGAAG CGACCCCTGC GGGAGAGGCA CCGGACCCCT CCCCAATGGC ATCTCGCCTA 2160 

CGTGGGAGAC GGGAGCCCCC TGTGGCCGAC TCCACTACCT CGCAGGCATT CCCCCTCCGC 2220

GCAGGAGGAA ACGGACAGCT TCAATACTGG CCGTTCTCCT CTTCTGACCT TTACAACTGG 2280

AAAAATAATA ACCCTTCTTT TTCTGAAGAT CCAGGTAAAC TGACAGCTCT GATCGAGTCT 2340 

GTTCTCATCA CCCATCAGCC GACCTGGGAC GACTGTCAGC AGCTGTTGGG GACTCTGCTG 2400 

ACCGGAGAAG AAAAACAACG GGTGCTCTTA GAGGCTAGAA AGGCGGTGCG GGGCGATGAT 2460 

GGGCGCCCCA CTCAACTGCC CAATGAAGTC GATGCCGCTT TTCCCCTCGA GCGCCCAGAC 2520

TGGGATTACA CCACCCAGGC AGGACGCAAC CACCTAGTCC ACTATCGCCA GTTGCTCCTA 2580

GCGGGTCTCC AAAACGCGGG CAGAAGCCCC ACCAATTTGG CCAAGGTAAA AGGAATAACA 2640

CAAGGGCCCA ATGAGTCTCC CTCGGCCTTC CTAGAGAGAC TTAAGGAAGC CTATCGCAGG 2700 

TACACTCCTT ATGACCCTGA GGACCCAGGG CAAGAAACTA ATGTGTCTAT GTCTTTCATT 2760 

TGGCAGTCTG CCCCAGACAT TGGGAGAAAG TTAGAGAGGT TAGAAGATTT AAAAAACAAG 2820 

ACGCTTGGAG ATTTGGTTAG AGAGGCAGAA AAGATCTTTA ATAAACGAGA AACCCCGGAA 2880 

GAAAGAGAGG AACGTATCAG GAGAGAAACA GAGGAAAAAG AAGAACGCCG TAGGACAGAG 2940

GATGAGCAGA AAGAGAAAGA AAGAGATCGT AGGAGACATA GAGAGATGAG CAAGCTATTG 3000

GCCACTGTCG TTAGTGGACA GAAACAGGAT AGACAGGGAG GAGAACGAAG GAGGTCCCAA 3060

CTCGATCGCG ACCAGTGTGC CTACTGCAAA GAAAAGGGGC ACTGGGCTAA AGATTGTCCC 3120 

AAGAAACCAC GAGGACCTCG GGGACCAAGA CCCCAGACCT CCCTCCTGAC CCTAGATGAC 3180 

TAGGGAGGTC AGGGTCAGGA GCCCCCCCCT GAACCCAGGA TAACCCTCAA AGTCGGGGGG 3240 

CAACCCGTCA CCTTCCTGGT AGATACTGGG GCCCAACACT CCGTGCTGAC CCAAAATCCT 3300

GGACCCCTAA GTGATAAGTC TGCCTGGGTC CAAGGGGCTA CTGGAGGAAA GCGGTATCGC 3360

TGGACCACGG ATCGCAAAGT ACATCTAGCT ACCGGTAAGG TCACCCACTC TTTCCTCCAT 3420 

GTACCAGACT GTCCCTATCC TCTGTTAGGA AGAGATTTGC TGACTAAACT AAAAGCCCAA 3480

ATCCACTTTG AGGGATCAGG AGCTCAGGTT ATGGGACCAA TGGGGCAGCC CCTGCAAGTG 3540

TTGACCCTAA ATATAGAAGA TGAGCATCGG CTACATGAGA CCTCAAAAGA GCCAGATGTT 3600

TCTCTAGGGT CCACATGGCT GTCTGATTTT CCTCAGGCCT GGGCGGAAAC CGGGGGCATG 3660

GGACTGGCAG TTCGCCAAGC TCCTCTGATC ATACCTCTGA AAGCAACCTC TACCCCCGTG 3720 

TCCATAAAAC AATACCCCAT GTCACAAGAA GCCAGACTGG GGATCAAGCC CCACATACAG 3780 

AGACTGTTGG ACCAGGGAAT ACTGGTACCC TGCCAGTCCC CCTGGAACAC GCCCCTGCTA 3840 

CCCGTTAAGA AACCAGGGAC TAATGATTAT AGGCCTGTCC AGGATCTGAG AGAAGTCAAC 3900

AAGCGGGTGG AAGACATCCA CCCCACCGTG CCCAACCCTT ACAACCTCTT GAGCGGGCTC 3960 

CCACCGTCCC ACCAGTGGTA CACTGTGCTT GATTTAAAGG ATGCCTTTTT CTGCCTGAGA 4020 

CTCCACCCCA CCAGTCAGCC TCTCTTCGCC TTTGAGTGGA GAGATCCAGA GATGGGAATC 4080 
'^f^AAT TGACCTGGAC CAGACTCCCA CAGGGTTTCA AAAACAGTCC CACCCGTTT
^t™. TGCACAGAGA CCTAGCAGAC TTCCGGATCC AGCACCCAGA CTTGATCCTG 
C ; AGAG J AGG TGGATGACTT ACTGCTGGCC GCCACTTCTG AGCTAGACTG CcScAAGCT 
ACTCGGGCCC TGTTACAAAC CCTAGGGAAC CTCGGGTATC GGGCCTCGGC cSffiuSS? 
CT^ScAGG C?^ G I CAAGTATCTG GGGTATCTTC tSSgAGGG SS 
CTGACTGAGG CCAGAAAAGA GACTGTGATG GGGCAGCCTA CTCCGAAGAC CCCTCGACAA 
J™^* TCCTAGGGAC GGCAGGCTTC TGTCGCCTCT GGATCCCTGG SSSJ 
ATGGCAGCCC CCTTGTACCC TCTCACCAAA ACGGGGACTC TGTTTAATTG Q3GCCCAGA? 
CAACAAAAGG CCTATCAAGA AATCAAGCAA GCTCTTCTAA CTGCCCCAGC £SS£^£ 
^ AGA I^ GA CTAAGG CCTT TGAACTCTTT GTCGACGAGA AGCAGGGCTA ScSa^S 
GTCCTAACGC AAAAACTGGG ACCTTGGCGT CGGCCGGTGG CCTACCTGTC CAAAAAGCTA 
G ^ CCA ^^ CAGCTGGGTG GCCCCCTTGC CTACGGATGG TAGCAGCCAT TGC^Sg 
ACAAAGGATG CAGGCAAGCT AACCATGGGA CAGCCACTAG TCATTCTGGC CCCCCATGCA 

gtagaggcac tagtcaaaca accccccgac cgctggcttt ccaacgcccg gatcStcac 

^ CAGGG ^ TGCTTTTGGA CACGGACCGG GTCCAGTTCG GACCGGTGGT aSctSc Hlo 
^5S C T AGGG TGCTCCCACT GCCTGAGGAA GGGCTGCAAC ACAACTGCCT TGATATCCTG tain 
GGC . GAAG 5 GC ACGGAACCCG ACCCGACCTA ACGGACCAGC CGCTCCCAGA CGCC^CcIc 
™ET55r AGA CGGATGGAAG CAGTCTCTTA CAAGAGGGAC AGCGTAAGGC GgSSJSS 
G !5 AG 2 ACCG A GACCGAGGT AATCTGGGCT AAAGCCCTGC CAGCCGGGAC ATOCgSSS 
GG ^2 C 3 GAAC ^ATAGCACT CACCCAGGCC CTAAAGATGG CAGAAGGTAA GAAGctIS? 

G ™; A ^ G atagccgtta tgcttttgct actgcccata tccatggaga aataSca^I 

AGGC . G ? GGGT TGCTCACATC AGAAGGCAAA GAGATCAAAA ATAAAGACGA GATCTTGG^ 
^I^™? CCCTCTT TCT GCCCAAAAGA CTTAGCATAA TCCATTGTCC AGGAcItcS 

^CGAGGC TAGAGGCAAC CGGATGGCTG ACCAAGCGGC CcSSSct 
GGCA ? CACAG AGACTCCAGA CACCTCTACC CTCCTCATAG AAAATTCATC MOCTAoK 
^ CAGA ^ CATT TTCATTACAC AGTGACTGAT ATAAAGGACC TAACCAAGTT GGGGGCCATT 
T A ^ AAAA CAAAGAAGTA TTGGGTCTAC CAAGGAAAAC CTGTGATGCC TGACoStS 
AG ™ GAAT TATTAGACTT TCTTCATCAG CTGACTCACC TCAGCTTCTC AAAAATGaIg 
£^Zr^ AGAGAAGCCA CAGTCCCTAC TACATGCTGA ACCGGGATCG AACAC^CAAA 

AGACCTGCAA AGCTTGTGCA CAAGTCAACG CCAGCAAGTC TGCCG'T'TAAA 
a™??™- ^TCCGCGG GCATCGGCCC GGCACTCATT GGGAGATCGA TTTCACCGAG 
ATAAAGCCCG GATTGTATGG CTATAAATAT CTTCTAGTTT TTATAGATAC CTTTTCTCrr 
r*™^ CCTTCCCAA C CAAGAAAGAA ACCGCCAAGG TCGtSa5 SSSSS 
GAGGAGATCT TCCCCAGGTT CGGCATGCCT CAGGTATTGG GAACTGACAA TGGGCCTGCC 

ttcgtctcca aggtgagtca gacagtggcc gatctgttgg ggmtgSS SS 

GACCCCAAAG CTCAGGCCAG GTAGAAAGAA TGAAtSc 
AATTAACGCT TGCAACTGGC TCTAGAGACT GGGTGCTCCT ACTCCCCT^A 
GAGCCCGCAA CACGCCGGGC CCCCATGGCC TCACCCCATA TGaStCTTA 
l^^f C CCCGCCCCT TGTAAACTTC CCTGACCCTG ACATGACAAG AGTTAC^AAC 
SSSSSS £TTACAGGCT CTCTACTTAG TCCAGCACGA SS 

GG ^ G ^ G ? CGG CA GCCTACCA AGAACAACTG GACCGACCGG TGGTACCTCA CCCTTACCGA 
G ! CGGG 2 ACA CAGTGTGGGT CCGCCGACAC CAGACTAAGA ACCTAGAACC TCGCTGGAAA 
SSS^S* CAGTCCTGC T GACCACCCCC ACCGCCCTCA AAGTAGACGG StcgSSS 
1^™%£G CCGCCCACGT GAAGGCTGCC GACCCCGGGG GTGGACCATC ScTAGACTC 
^ CA ^ GGCG TTCAA CGCTC TCAAAACCCC TTAAAAATAA GGTTAACCCG CGAGGCcS? 
! AA T GGCGTT AATTCTTCTG ATGCTCAGAG GGGTCAGTAC TGCTTCGCCC GGCTCCAGTG 
CG ^ C ^^ CC ^C^CATG AAAACATTTA ACATTTCTCA ACAAGATCTA GAATTAgSg 
^ G * AGGGAC AGAGAAGATT ACAATGCTTT ATGAGGATAA TAAACATCAT GTGGGAGCG? 
r^™™- GAAAA CAGGA GAAATCATTT CGGCAGTACA TATTGAAGCG . TATAT\GGAC 
^^ C ™ GT TTGTGCAGAA GCCATTGCGA TTGGTAGTGC AGTTTCGAAT GGACAAAAGG 
r^r ACAC GATTGTAGG ? GTTAGACACC CTTATTCTGA CGAAGTAGAT AGAAG^ATTC 
TCCTTGTGGT ATGTGTAGGG AGTTGATTTC AGACTATGCA CCAGATTGTT 
l^ll^ AGAAATGAAT GGCAAGTTAG TCAAAACTAC GATTGAAGAA CTCMTCCAC 
TCAAATATAC CCGAAATTAA AAGTTTTACC ACCAAGCTTA TCGAATTC CTCATTCCAC 
Figure 9. FBdelPASAF Sequence 1

CATATGCGGT GTGAAATACC GCACAGATGC GTAAGGAGAA AATACCGCAT CAGGCGCCAT 60 

TCGCCATTCA GGCTGCGCAA CTGTTGGGAA GGGCGATCGG TGCGGGCCTC TTCGCTATTA 120 

CGCCAGCTGG CGAAAGGGGG ATGTGCTGCA AGGCGATTAA GTTGGGTAAC GCCAGGGTTT 180 

TCCCAGTCAC GACGTTGTAA AACGACGGCC AGTGAATTCC GATTAGTTCA ATTTGTTAAA 240 

GACAGGATCT CAGTAGTCCA GGCTTTAGTC CTGACTCAAC AATACCACCA GCTAAAACCA 300 

CTAGAATACG AGCCACAATA AATAAAAGAT TTTATTTAGT TTCCAGAAAA AGGGGGGAAT 360

GAAAGACCCC ACCAAATTGC TTAGCCTGAT AGCCGCAGTA ACGCCATTTT GCAAGGCATG 420 

GAAAAATACC AAACCAAGAA TAGAGAAGTT CAGATCAAGG GCGGGTACAC GAAAACAGCT 480 

AACGTTGGGC CAAACAGGAT ATCTGCGGTG AGCAGTTTCG GCCCCGGCCC GGGGCCAAGA 540 

ACAGATGGTC ACCGCGGTTC GGCCCCGGCC CGGGGCCAAG AACAGATGGT CCCCAGATAT 600

GGCCCAACCC TCAGCAGTTT CTTAAGACCC ATCAGATGTT TCCAGGCTCC CCCAAGGACC 660 

TGAAATGACC CTGTGCCTTA TTTGAATTAA CCAATCAGCC TGCTTCTCGC TTCTGTTCGC 720 

GCGCTTCTGC TTCCCGAGCT CTATAAAAGA GCTCACAACC CCTCACTCGG CGCGCCAGTC 780 

CTCCGATAGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT AAATCCTCTT GCTGTTGCAT 840 

CCGACTCGTG GTCTCGCTGT TCCTTGGGAG GGTCTCCTCA GAGTGATTGA CTACCCGTCT 900 

CGGGGGTCTT TCATTTGGGG GCTCGTCCGG GATCTGGAGA CCCCTGCCCA GGGACCACCG 960 

ACCCACCACC GGGAGGTAAG CTGGCCAAGA TCTTATATGG GGCACCCCCG CCCCTTGTAA 1020 

ACTTCCCTGA CCCTGACATG ACCAGAGTTA CTAACAGCCC CTCTCTCCAA GCTCACTTAC 1080 

AGGCTCTCTA CTTAGTCCAG CACGAAGTTT GGAGACCACT GGCGGCAGCT TACCAAGAAC 1140 

AACTGGACCG GCCGGTGGTG CCTCACCCTT ACCGGGTCGG CGACACAGTG TGGGTCCGCC 1200 

GACATCAAAC CAAGAACCTA GAACCTCGCT GGAAAGGACC TTACACAGTC CTGCTGACCA 1260 

CCCCCACCGC CCTCAAAGTA GACGGTATCG CAGCTTGGAT ACACGCAGCC CACGTAAAGG 1320 

CGGCCGACAC CGAGAGTGGA CCATCCTCTG GACGGACATG GCGCGTTCAA CGCTCTCAAA 1380 

ACCCCCTCAA GATAAGATTA ACCCGTGGAA GCCCTTAATA GTCATGGGAG TCCTGTTAGG 1440 

AGTAGGGATG GCAGAGAGCC CCCATCAGGT CTTTAATGTA ACCTGGAGAG TCACCAACCT 1500 

GATGACTGGG CGTACCGCCA ATGCCACCTC CCTCCTGGGA ACTGTACAAG ATGCCTTCCC 1560 

AAAATTATAT TTTGATCTAT GTGATCTGGT CGGAGAGGAG TGGGACCCTT CAGACCAGGA 1620

ACCGTATGTC GGGTATGGCT GCAAGTACCC CGCAGGGAGA CAGCGGACCC GGACTTTTGA 1680 

CTTTTACGTG TGCCCTGGGC ATACCGTAAA GTCGGGGTGT GGGGGACCAG GAGAGGGCTA 1740 

CTGTGGTAAA TGGGGGTGTG AAACCACCGG ACAGGCTTAC TGGAAGCCCA CATCATCGTG 1800 

GGACCTAATC TCCCTTAAGC GCGGTAACAC CCCCTGGGAC ACGGGATGCT CTAAAGTTGC 1860 

CTGTGGCCCC TGCTACGACC TCTCCAAAGT ATCCAATTCC TTCCAAGGGG CTACTCGAGG 1920 

GGGCAGATGC AACCCTCTAG TCCTAGAATT CACTGATGCA GGAAAAAAGG CTAACTGGGA 1980 

CGGGCCCAAA TCGTGGGGAC TGAGACTGTA CCGGACAGGA ACAGATCCTA TTACCATGTT 2040 

CTCCCTGACC CGGCAGGTCC TTAATGTGGG ACCCCGAGTC CCCATAGGGC CCAACCCAGT 2100 

ATTACCCGAC CAAAGACTCC CTTCCTCACC AATAGAGATT GTACCGGCTC CACAGCCACC 2160 

TAGCCCCCTC AATACCAGTT ACCCCCCTTC CACTACCAGT ACACCCTCAA CCTCCCCTAC 2220 

AAGTCCAAGT GTCCCACAGC CACCCCCAGG AACTGGAGAT AGACTACTAG CTCTAGTCAA 2280 

AGGAGCCTAT CAGGCGCTTA ACCTCACCAA TCCCGACAAG ACCCAAGAAT GTTGGCTGTG 2340 

CTTAGTGTCG GGACCTCCTT ATTACGAAGG AGTAGCGGTC GTGGGCACTT ATACCAATCA 2400 

TTCCACCGCT CCGGCCAACT GTACGGCCAC TTCCCAACAT AAGCTTACCC TATCTGAAGT 2460 

GACAGGACAG GGCCTATGCA TGGGGGCAGT ACCTAAAACT CACCAGGCCT TATGTAACAC 2520

CACCCAAAGC GCCGGCTCAG GATCCTACTA CCTTGCAGCA CCCGCCGGAA CAATGTGGGC 2580

TTGCAGCACT GGATTGACTC CCTGCTTGTC CACCACGGTG CTCAATCTAA CCACAGATTA 2640

TTGTGTATTA GTTGAACTCT GGCCCAGAGT AATTTACCAC TCCCCCGATT ATATGTATGG 2700 

TCAGCTTGAA CAGCGTACCA AATATAAAAG AGAGCCAGTA TCATTGACCC TGGCCCTTCT 2760 

ACTAGGAGGA TTAACCATGG GAGGGATTGC AGCTGGAATA GGGACGGGGA CCACTGCCTT 2820 

AATTAAAACC CAGCAGTTTG AGCAGCTTCA TGCCGCTATC CAGACAGACC TCAACGAAGT 2880

CGAAAAGTCA ATTACCAACC TAGAAAAGTC ACTGACCTCG TTGTCTGAAG TAGTCCTACA 2940

GAACCGCAGA GGCCTAGATT TGCTATTCCT AAAGGAGGGA GGTCTCTGCG CAGCCCTAAA 3000 

AGAAGAATGT TGTTTTTATG CAGACCACAC GGGGCTAGTG AGAGACAGCA TGGCCAAATT 3060

AAGAGAAAGG CTTAATCAGA GACAAAAACT ATTTGAGACA GGCCAAGGAT GGTTCGAAGG 3120 

GCTGTTTAAT AGATCCCCCT GGTTTACCAC CTTAATCTCC ACCATCATGG GACCTCTAAT 3180 

AGTACTCTTA CTGATCTTAC TCTTTGGACC TTGCATTCTC AATCGATTAG TTCAATTTGT 3240 

TAAAGACAGG ATCTCAGTAG TCCAGGCTTT AGTCCTGACT CAACAATACC ACCAGCTAAA 3300

GCCTATAGAG TACGAGCCAT AGGGCGCCTA GTGTTGACAA TTAATCATCG GCATAGTATA 3360

CGGCATAGTA TAATACGACT CACTATAGGA GGGCCACCAT GGCCAAGTTG ACCAGTGCCG 3420

TTCCGGTGCT CACCGCGCGC GACGTCGCCG GAGCGGTCGA GTTCTGGACC GACCGGCTCG 3480

GGTTCTCCCG GGACTTCGTG GAGGACGACT TCGCCGGTGT GGTCCGGGAC GACGTGACCC 3540

TGTTCATCAG CGCGGTCCAG GACCAGGTGG TGCCGGACAA CACCCTGGCC TGGGTGTGGG 3600

TGCGCGGCCT GGACGAGCTG TACGCCGAGT GGTCGGAGGT CGTGTCCACG AACTTCCGGG 3660

ACGCCTCCGG GCCGGCCATG ACCGAGATCG GCGAGCAGCC GTGGGGGCGG GAGTTCGCCC 3720 

TGCGCGACCC GGCCGGCAAC TGCGTGCACT TCGTGGCCGA GGAGCAGGAC TGANNNNCGG 3780 

ACCGGTCGAC TTGTTAACTT GTTTATTGCA GCTTATAATG GTTACAAATA AAGCAATAGC 3840

ATCACAAATT TCACAAATAA AGCATTTTTT TCACTGCATT CTAGTTGTGG TTTGTCCAAA 3900 

CTCATCAATG TATCTTATCA TGTCTGGATC CAGATCTGGG CCCATGCGGC CGCGGATCGA 3960

TNNNNACATG TGAGCAAAAG GCCAGCAAAA GGCCAGGAAC CGTAAAAAGG CCGCGTTGCT 4020

GGCGTTTTTC CATAGGCTCC GCCCCCCTGA CGAGCATCAC AAAAATCGAC GCTCAAGTCA 4080 



WO 97/08330    15/22    PCT/GB96/02061

Figure 9. FBdelPASAF Sequence 2



GAGGTGGCGA AACCCGACAG GACTATAAAG ATACCAGGCG TTTCCCCCTG GAAGCTCCCT 4140 
CGTGCGCTCT CCTGTTCCGA CCCTGCCGCT TACCGGATAC CTGTCCGCCT TTCTCCCTTC 4200

GGGAAGCGTG GCGCTTTCTC AATGCTCACG CTGTAGGTAT CTCAGTTCGG TGTAGGTCGT 4260 

TCGCTCCAAG CTGGGCTGTG TGCACGAACC CCCCGTTCAG CCCGACCGCT GCGCCTTATC 4320 

CGGTAACTAT CGTCTTGAGT CCAACCCGGT AAGACACGAC TTATCGCCAC TGGCAGCAGC 4380 

CACTGGTAAC AGGATTAGCA GAGCGAGGTA TGTAGGCGGT GCTACAGAGT TCTTGAAGTG 4440 

GTGGCCTAAC TACGGCTACA CTAGAAGGAC AGTATTTGGT ATCTGCGCTC TGCTGAAGCC 4500 

AGTTACCTTC GGAAAAAGAG TTGGTAGCTC TTGATCCGGC AAACAAACCA CCGCTGGTAG 4560 

CGGTGGTTTT TTTGTTTGCA AGCAGCAGAT TACGCGCAGA AAAAAAGGAT CTCAAGAAGA 4620

TCCTTTGATC TTTTCTACGG GGTCTGACGC TCAGTGGAAC GAAAACTCAC GTTAAGGGAT 4680

TTTGGTCATG AGATTATCAA AAAGGATCTT CACCTAGATC CTTTTAAATT AAAAATGAAG 4740 

TTTTAAATCA ATCTAAAGTA TATATGAGTA AACTTGGTCT GACAGTTACC AATGCTTAAT 4800 

CAGTGAGGCA CCTATCTCAG CGATCTGTCT ATTTCGTTCA TCCATAGTTG CCTGACTCCC 4860

CGTCGTGTAG ATAACTACGA TACGGGAGGG CTTACCATCT GGCCCCAGTG CTGCAATGAT 4920 

ACCGCGAGAC CCACGCTCAC CGGCTCCAGA TTTATCAGCA ATAAACCAGC CAGCCGGAAG 4980

GGCCGAGCGC AGAAGTGGTC CTGCAACTTT ATCCGCCTCC ATCCAGTCTA TTAATTGTTG 5040 

CCGGGAAGCT AGAGTAAGTA GTTCGCCAGT TAATAGTTTG CGCAACGTTG TTGCCATTGC 5100 

TACAGGCATC GTGGTGTCAC GCTCGTCGTT TGGTATGGCT TCATTCAGCT CCGGTTCCCA 5160 

ACGATCAAGG CGAGTTACAT GATCCCCCAT GTTGTGCAAA AAAGCGGTTA GCTCCTTCGG 5220

TCCTCCGATC GTTGTCAGAA GTAAGTTGGC CGCAGTGTTA TCACTCATGG TTATGGCAGC 5280 

ACTGCATAAT TCTCTTACTG TCATGCCATC CGTAAGATGC TTTTCTGTGA CTGGTGAGTA 5340 

CTCAACCAAG TCATTCTGAG AATAGTGTAT GCGGCGACCG AGTTGCTCTT GCCCGGCGTC 5400 

AATACGGGAT AATACCGCGC CACATAGCAG AACTTTAAAA GTGCTCATCA TTGGAAAACG 5460

TTCTTCGGGG CGAAAACTCT CAAGGATCTT ACCGCTGTTG AGATCCAGTT CGATGTAACC 5520

CACTCGTGCA CCCAACTGAT CTTCAGCATC TTTTACTTTC ACCAGCGTTT CTGGGTGAGC 5580 

AAAAACAGGA AGGCAAAATG CCGCAAAAAA GGGAATAAGG GCGACACGGA AATGTTGAAT 5640 

ACTCATACTC TTCCTTTTTC AATATTATTG AAGCATTTAT CAGGGTTATT GTCTCATGAG 5700 

CGGATACATA TTTGAATGTA TTTAGAAAAA TAAACAAATA GGGGTTCCGC GCACATTTCC 5760 

CCGAAAAGTG CCACCTGACG TCTAAGAAAC CATTATTATC ATGACATTAA CCTATAAAAA 5820

TAGGCGTATC ACGAGGCCCT TTCGTCTCGC GCGTTTCGGT GATGACGGTG AAAACCTCTG 5880 

ACACATGCAG CTCCCGGAGA CGGTCACAGC TTGTCTGTAA GCGGATGCCG GGAGCAGACA 5940 

AGCCCGTCAG GGCGCGTCAG CGGGTGTTGG CGGGTGTCGG GGCTGGCTTA ACTATGCGGC 6000
ATCAGAGCAG ATTGTACTGA GAGTGCAC 
CATATGCGGT GTGAAATACC GCACAGATGC GTAAGGAGAA AATACCGCAT CAGGCGCCAT 60
TCGCCATTCA GGCTGCGCAA CTGTTGGGAA GGGCGATCGG TGCGGGCCTC TTCGCTATTA 120

CGCCAGCTGG CGAAAGGGGG ATGTGCTGCA AGGCGATTAA GTTGGGTAAC GCCAGGGTTT 180 

TCCCAGTCAC GACGTTGTAA AACGACGGCC AGTGAATTCC GATTAGTTCA ATTTGTTAAA 240 

GACAGGATCT CAGTAGTCCA GGCTTTAGTC CTGACTCAAC AATACCACCA GCTAAAACCA 300

CTAGAATACG AGCCACAATA AATAAAAGAT TTTATTTAGT TTCCAGAAAA AGGGGGGAAT 360

GAAAGACCCC ACCAAATTGC TTAGCCTGAT AGCCGCAGTA ACGCCATTTT GCAAGGCATG 420 

GAAAAATACC AAACCAAGAA TAGAGAAGTT CAGATCAAGG GCGGGTACAC GAAAACAGCT 480

AACGTTGGGC CAAACAGGAT ATCTGCGGTG AGCAGTTTCG GCCCCGGCCC GGGGCCAAGA 540 

ACAGATGGTC ACCGCGGTTC GGCCCCGGCC CGGGGCCAAG AACAGATGGT CCCCAGATAT 600 

GGCCCAACCC TCAGCAGTTT CTTAAGACCC ATCAGATGTT TCCAGGCTCC CCCAAGGACC 660

TGAAATGACC CTGTGCCTTA TTTGAATTAA GCAATCAGCC TGCTTCTCGC TTCTGTTCGC 720 

GCGCTTCTGC TTCCCGAGCT CTATAAAAGA GCTCACAACC CCTCACTCGG CGCGCCAGTC 780

CTCCGATAGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT AAATCCTCTT GCTGTTGCAT 840 

CCGACTCGTG GTCTCGCTGT TCCTTGGGAG GGTCTCCTCA GAGTGATTGA CTACCCGTCT 900 

CGGGGGTCTT TCATTTGGGG GCTCGTCCGG GATCTGGAGA CCCCTGCCCA GGGACCACCG 960 

ACCCACCACC GGGAGGTAAG CTGGCCAAGA TCTTATATGG GGCACCCCCG CCCCTTGTAA 1020

ACTTCCCTGA CCCTGACATG ACAAGAGTTA CTAACAGCCC CTCTCTCCAA GCTCACTTAC 1080 

AGGCTCTCTA CTTAGTCCAG CACGAAGTCT GGAGACCTCT GGCGGCAGCC TACCAAGAAC 1140 

AACTGGACCG ACCGGTGGTA CCTCACCCTT ACCGAGTCGG CGACACAGTG TGGGTCCGCC 1200 

GACACCAGAC TAAGAACCTA GAACCTCGCT GGAAAGGACC TTACACAGTC CTGCTGACCA 1260

CCCCCACCGC CCTCAAAGTA GACGGCATCG CAGCTTGGAT ACACGCCGCC CACGTGAAGG 1320 

CTGCCGACCC CGGGGGTGGA CCATCCTCTA GACTGACATG GCGCGTTCAA CGCTCTCAAA 1380

ACCCCTTAAA AATAAGGTTA ACCCGCGAGG CCCCCTAATC CCCTTAATTC TTCTGATGCT 1440 

CAGAGGGGTC AGTACTGCTT CGCCCGGCTC CAGTCCTCAT CAAGTCTATA ATATCACCTG 1500 

GGAGGTAACC AATGGAGATC GGGAGACGGT ATGGGCAACT TCTGGCAACC ACCCTCTGTG 1560

GACCTGGTGG CCTGACCTTA CCCCAGATTT ATGTATGTTA GCCCACCATG GACCATCTTA 1620 

TTGGGGGCTA GAATATCAAT CCCCTTTTTC TTCTCCCCCG GGGCCCCCTT GTTGCTCAGG 1680 

GGGCAGCAGC CCAGGCTGTT CCAGAGACTG CGAAGAACCT TTAACCTCCC TCACCCCTCG 1740 

GTGCAACACT GCCTGGAACA GACTCAAGCT AGACCAGACA ACTCATAAAT CAAATGAGGG 1800 

ATTTTATGTT TGCCCCGGGC CCCACCGCCC CCGAGAATCC AAGTCATGTG GGGGTCCAGA 1860

CTCCTTCTAC TGTGCCTATT GGGGCTGTGA GACAACCGGT AGAGCTTACT GGAAGCCCTC 1920

CTCATCATGG GATTTCATCA CAGTAAACAA CAATCTCACC TCTGACCAGG CTGTCCAGGT 1980 

ATGCAAAGAT AATAAGTGGT GCAACCCCTT AGTTATTCGG TTTACAGACG CCGGGAGACG 2040 

GGTTACTTCC TGGACCACAG GACATTACTG GGGCTTACGT TTGTATGTCT CCGGACAAGA 2100 

TCCAGGGCTT ACATTTGGGA TCCGACTCAG ATACCAAAAT CTAGGACCCC GCGTCCCAAT 2160 

AGGGCCAAAC CCCGTTCTGG CAGACCAACA GCCACTCTCC AAGCCCAAAC CTGTTAAGTC 2220

GCCTTCAGTC ACCAAACCAC CCAGTGGGAC TCCTCTCTCC CCTACCCAAC TTCCACCGGC 2280

GGGAACGGAA AATAGGCTGC TAAACTTAGT AGACGGAGCC TACCAAGCCC TCAACCTCAC 2340

CAGTCCTGAC AAAACCCAAG AGTGCTGGTT GTGTCTAGTA GCGGGACCCC CCTACTACGA 2400 

AGGGGTTGCC GTCCTGGGTA CCTACTCCAA CCATACCTCT GCTCCAGCCA ACTGCTCCGT 2460

GGCCTCCCAA CACAAGTTGA CCCTGTCCGA AGTGACCGGA CAGGGACTCT GCATAGGAGC 2520 

AGTTCCCAAA ACACATCAGG CCCTATGTAA TACCACCCAG ACAAGCAGTC GAGGGTCCTA 2580 

TTATCTAGTT GCCCCTACAG GTACCATGTG GGCTTGTAGT ACCGGGCTTA CTCCATGCAT 2640

CTCCACCACC ATACTGAACC TTACCACTGA TTATTGTGTT CTTGTCGAAC TCTGGCCAAG 2700 

AGTCACCTAT CATTCCCCCA GCTATGTTTA CGGCCTGTTT GAGAGATCCA ACCGACACAA 2760

AAGAGAACCG GTGTCGTTAA CCCTGGCCCT ATTATTGGGT GGACTAACCA TGGGGGGAAT 2820 

TGCCGCTGGA ATAGGAACAG GGACTACTGC TCTAATGGCC ACTCAGCAAT TCCAGCAGCT 2880 

CCAAGCCGCA GTACAGGATG ATCTCAGGGA GGTTGAAAAA TCAATCTCTA ACCTAGAAAA 2940 

GTCTCTCACT TCCCTGTCTG AAGTTGTCCT ACAGAATCGA AGGGGCCTAG ACTTGTTATT 3000 

TCTAAAAGAA GGAGGGCTGT GTGCTGCTCT AAAAGAAGAA TGTTGCTTCT ATGCGGACCA 3060 

CACAGGACTA GTGAGAGACA GCATGGCCAA ATTGAGAGAG AGGCTTAATC AGAGACAGAA 3120 

ACTGTTTGAG TCAACTCAAG GATGGTTTGA GGGACTGTTT AACAGATCCC CTTGGTTTAC 3180 

CACCTTGATA TCTACCATTA TGGGACCCCT CATTGTACTC CTAATGATTT TGCTCTTCGG 3240

ACCCTGCATT CTTAATCGAT TAGTTCAATT TGTTAAAGAC AGGATCTCAG TAGTCCAGGC 3300 

TTTAGTCCTG ACTCAACAAT ACCACCAGCT AAAGCCTATA GAGTACGAGC CATAGGGCGC 3360

CTAGTGTTGA CAATTAATCA TCGGCATAGT ATACGGCATA GTATAATACG ACTCACTATA 3420 

GGAGGGCCAC CATGGCCAAG TTGACCAGTG CCGTTCCGGT GCTCACCGCG CGCGACGTCG 3480 

CCGGAGCGGT CGAGTTCTGG ACCGACCGGC TCGGGTTCTC CCGGGACTTC GTGGAGGACG 3540 

ACTTCGCCGG TGTGGTCCGG GACGACGTGA CCCTGTTCAT CAGCGCGGTC CAGGACCAGG 3600

TGGTGCCGGA CAACACCCTG GCCTGGGTGT GGGTGCGCGG CCTGGACGAG CTGTACGCCG 3660 

AGTGGTCGGA GGTCGTGTCC ACGAACTTCC GGGACGCCTC CGGGCCGGCC ATGACCGAGA 3720

TCGGCGAGCA GCCGTGGGGG CGGGAGTTCG CCCTGCGCGA CCCGGCCGGC AACTGCGTGC 3780 

ACTTCGTGGC CGAGGAGCAG GACTGANNNN CGGACCGGTC GACTTGTTAA CTTGTTTATT 3840 

GCAGCTTATA ATGGTTACAA ATAAAGCAAT AGCATCACAA ATTTCACAAA TAAAGCATTT 3900 

TTTTCACTGC ATTCTAGTTG TGGTTTGTCC AAACTCATCA ATGTATCTTA TCATGTCTGG 3960 

ATCCAGATCT GGGCCCATGC GGCCGCGGAT CGATNNNNAC ATGTGAGCAA AAGGCCAGCA 4020 

AAAGGCCAGG AACCGTAAAA AGGCCGCGTT GCTGGCGTTT TTCCATAGGC TCCGCCCCCC 4080 
SScSg GCG^C CAGGACTA ^ - 4140

GCTTACCGGA TACCTGTCCG cStcTCC? ™S22S£ J~rrrrr^ CGACCC TGCC 4200 

ACGCTGTAGG TATCTCaSS SSSS 4260 

ACCCCCCGTT CAGCCCGACC GCTGCGCCTT ATCCGgSS SScgS^G Irvr^rrr 4320 

GGTAAGACAC GACTTATCGC CACTGGCAGC AGCCACTGGT llcArcll£ ^^ AACCC 4380 

gtatgtaggc ggtgctacag agttcttcS JSSJSsSS JJSSSSJ SSSSSS 4440 

GACAGTATTT GGTATCTGCG CTCTGCTGAA GCCAGTTArr ^rriiZ*? 4 500 

CTCTTGATCC GGCAAACAAA CCACCGC^ ££££££ ESS^ J!" 

GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG ItCTTOTC^ C^OTCTPA Mil 

^il^l^ AACGAAAACT CACGTTAAGG GATTTTGGTC ATGAGATT\T SKKSS JSfo 

CTTCACCTAG ATCCTTTTAA ATTAAAAATG AAGTTfTAAA TCAATCTAAA r^Vr^^T 4740 

^sss jggjsss sssss I™ ill |||§ 

™ iiiii mi liiii i° 

Eli HEE iill^ll IIP Illl is 

SrSSS 0 TTATCAc ™ SSSS? SSgS? aSSSS 5280 

ATCCGTAAGA TGCTTTTCTG TGACTGGTGA GTACTCAACC AAGTCaSS SSS^HSH 
TATGCGGCGA CCGAGTTGCT cttgcccggc gtcaa^acgg SSiJS 

CAGAACTTTA AAAGTGCTCA TCATTGGAAA ACGTTCTTCG rnrS.n SSSS* 0 *™ 0 

CTTACCGCTG TTGAGATCCA GTOCGATCTA ACCESS? c^rr^S TCTCAAGGAT 5520 

= ssss sss™ — HE HIE - 
£5= sasss ssss sssKs ™i IS 

^ATTATT ATCATGACAT TAACCTATAA AAATAGGCGT ATCACGAGGC CCTTTCGTCT 
f ^°SS C GCTGATGACG GTGAAAACCT CTGACACATG CAGCTCCCgS AGACGGTCAr 
TAAGCGGATG CCGGGAGCAG ACAAGCCCGT CAGGGCGCGT SSSSSSS 
TGGCGGGTGT CGGGGCTGGC TTAACTATGC GGCATCAGAG CAGATTGtS tcS^gS 
CATATGCGGT GTGAAATACC GCACAGATGC GTAAGGAGAA AATACCGCAT CAGGCGCCAT 60
TCGCCATTCA GGCTGCGCAA CTGTTGGGAA GGGCGATCGG TGCGGGCCTC TTCGCTATTA 120

CGCCAGCTGG CGAAAGGGGG ATGTGCTGCA AGGCGATTAA GTTGGGTAAC GCCAGGGTTT 180 

TCCCAGTCAC GACGTTGTAA AACGACGGCC AGTGAATTCC GATTAGTTCA ATTTGTTAAA 240 

GACAGGATCT CAGTAGTCCA GGCTTTAGTC CTGACTCAAC AATACCACCA GCTAAAACCA 300 

CTAGAATACG AGCCACAATA AATAAAAGAT TTTATTTAGT TTCCAGAAAA AGGGGGGAAT 360

GAAAGACCCC ACCAAATTGC TTAGCCTGAT AGCCGCAGTA ACGCCATTTT GCAAGGCATG 420

GAAAAATACC AAACCAAGAA TAGAGAAGTT CAGATCAAGG GCGGGTACAC GAAAACAGCT 480 

AACGTTGGGC CAAACAGGAT ATCTGCGGTG AGCAGTTTCG GCCCCGGCCC GGGGCCAAGA 540 

ACAGATGGTC ACCGCGGTTC GGCCCCGGCC CGGGGCCAAG AACAGATGGT CCCCAGATAT 600 

GGCCCAACCC TCAGCAGTTT CTTAAGACCC ATCAGATGTT TCCAGGCTCC CCCAAGGACC 660 

TGAAATGACC CTGTGCCTTA TTTGAATTAA CCAATCAGCC TGCTTCTCGC TTCTGTTCGC 720 

GCGCTTCTGC TTCCCGAGCT CTATAAAAGA GCTCACAACC CCTCACTCGG CGCGCCAGTC 780 

CTCCGATAGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT AAATCCTCTT GCTGTTGCAT 840 

CCGACTCGTG GTCTCGCTGT TCCTTGGGAG GGTCTCCTCA GAGTGATTGA CTACCCGTCT 900 

CGGGGGTCTT TCATTTGGGG GCTCGTCCGG GATCTGGAGA CCCCTGCCCA GGGACCACCG 960

ACCCACCACC GGGAGGTAAG CTGGCCAAGA TCCCTAAGGT ACTCGGGTCA GACAATGGCC 1020

CGGCCTTTGT TGCTCAGGTA AGTCAGGGAC TGGCCACTCA ACTGGGGATA AATTGGAAGT 1080 

TACATTGTGC GTATAGACCC CAGAGCTCAG GTCAGGTAGA AAGAATGAAC AGAACAATTA 1140 

AAGAGACCTT GACCAAATTA GCCTTAGAGA CCGGTGGAAA AGACTGGGTG ACCCTCCTTC 1200

CCTTAGCGCT GCTTAGGGCC AGGAATACCC CTGGCCGGTT TGGTTTAACT CCTTATGAAA 1260

TTCTCTATGG AGGACCACCC CCCATACTTG AGTCTGGAGA AACTTTGGGT CCCGATGATA 1320 

GATTTCTCCC TGTCTTATTT ACTCACTTAA AGGCTTTAGA AATTGTAAGG ACCCAAATCT 1380

GGGACCAGAT CAAAGAGGTG TATAAGCCTG GTACCGTAAC AATCCCTCAC CCGTTCCAGG 1440 

TCGGGGATCA AGTGCTTGTC AGACGCCATC GACCCAGCAG CCTTGAGCCT CGGTGGAAAG 1500 

GCCCATACCT GGTGTTGCTG ACTACCCCGA CCGCGGTAAA AGTCGATGGT ATTGCTGCCT 1560

GGGTCCATGC TTCTCACCTC AAACCTGCAC CACCTTCGGC ACCAGATGAG TCCTGGGAGC 1620 

TGGAAAAGAC TGATCATCCT CTTAAGCTGC GTATTCGGCG GCGGCGGGAC GAGTCTGCAA 1680

AATAAGAACC CCCACCAGCC CATGACCCTC ACTTGGCAGG TACTGTCCCA AACTGGAGAC 1740

GTTGTCTGGG ATACAAAGGC AGTCCAGCCC CCTTGGACTT GGTGGCCCAC ACTTAAACCT 1800 

GATGTATGTG CCTTGGCGGC TAGTCTTGAG TCCTGGGATA TCCCGGGAAC CGATGTCTCG 1860

TCCTCTAAAC GAGTCAGACC TCCGGACTCA GACTATACTG CCGCTTATAA GCAAATCACC 1920 

TGGGGAGCCA TAGGGTGCAG CTACCCTCGG GCTAGGACTA GAATGGCAAG CTCTACCTTC 1980 

TACGTATGTC CCCGGGATGG CCGGACCCTT TCAGAAGCTA GAAGGTGCGG GGGGCTAGAA 2040 

TCCCTATACT GTAAAGAATG GGATTGTGAG ACCACGGGGA CCGGTTATTG GCTATCTAAA 2100 

TCCTCAAAAG ACCTCATAAC TGTAAAATGG GACCAAAATA GCGAATGGAC TCAAAAATTT 2160 

CAACAGTGTC ACCAGACCGG CTGGTGTAAC CCCCTTAAAA TAGATTTCAC AGACAAAGGA 2220 

AAATTATCCA AGGACTGGAT AACGGGAAAA ACCTGGGGAT TAAGATTCTA TGTGTCTGGA 2280

CATCCAGGCG TACAGTTCAC CATTCGCTTA AAAATCACCA ACATGCCAGC TGTGGCAGTA 2340

GGTCCTGACC TCGTCCTTGT GGAACAAGGA CCTCCTAGAA CGTCCCTCGC TCTCCCACCT 2400 

CCTCTTCCCC CAAGGGAAGC GCCACCGCCA TCTCTCCCCG ACTCTAACTC CACAGCCCTG 2460

GCGACTAGTG CACAAACTCC CACGGTGAGA AAAACAATTG TTACCCTAAA CACTCCGCCT 2520 

CCCACCACAG GCGACAGACT TTTTGATCTT GTGCAGGGGG CCTTCCTAAC CTTAAATGCT 2580 

ACCAACCCAG GGGCCACTGA GTCTTGCTGG CTTTGTTTGG CCATGGGCCC CCCTTATTAT 2640 

GAAGCAATAG CCTCATCAGG AGAGGTCGCC TACTCCACCG ACCTTGACCG GTGCCGCTGG 2700 

GGGACCCAAG GAAAGCTCAC CCTCACTGAG GTCTCAGGAC ACGGGTTGTG CATAGGAAAG 2760

GTGCCCTTTA CCCATCAGCA TCTCTGCAAT CAGACCCTAT CCATCAATTC CTCCGGAGAC 2820 

CATCAGTATC TGCTCCCCTC CAACCATAGC TGGTGGGCTT GCAGCACTGG CCTCACCCCT 2880

TGCCTCTCCA CCTCAGTTTT TAATCAGACT AGAGATTTCT GTATCCAGGT CCAGCTGATT 2940 

CCTCGCATCT ATTACTATCC TGAAGAAGTT TTGTTACAGG CCTATGACAA TTCTCACCCC 3000 

AGGACTAAAA GAGAGGCTGT CTCACTTACC CTAGCTGTTT TACTGGGGTT GGGAATCACG 3060 

GCGGGAATAG GTACTGGTTC AACTGCCTTA ATTAAAGGAC CTATAGACCT CCAGCAAGGC 3120 

CTGACAAGCC TCCAGATCGC CATAGATGCT GACCTCCGGG CCCTCCAAGA CTCAGTCAGC 3180 

AAGTTAGAGG ACTCACTGAC TTCCCTGTCC GAGGTAGTGC TCCAAAATAG GAGAGGCCTT 3240 

GACTTGCTGT TTCTAAAAGA AGGTGGCCTC TGTGCGGCCC TAAAGGAAGA GTGCTGTTTT 3300

TACATAGACC ACTCAGGTGC AGTACGGGAC TCCATGAAAA AACTCAAAGA AAAACTGGAT 3360

AAAAGACAGT TAGAGCGCCA GAAAAGCCAA AACTGGTATG AAGGATGGTT CAATAACTCC 3420

CCTTGGTTCA CTACCCTGCT ATCAACCATC GCTGGGCCCC TATTACTCCT CCTTCTGTTG 3480 

CTCATCCTCG GGCCATGCAT CATCAATCGA TTAGTTCAAT TTGTTAAAGA CAGGATCTCA 3540 

GTAGTCCAGG CTTTAGTCCT GACTCAACAA TACCACCAGC TAAAGCCTAT AGAGTACGAG 3600

CCATAGGGCG CCTAGTGTTG ACAATTAATC ATCGGCATAG TATACGGCAT AGTATAATAC 3660

GACTCACTAT AGGAGGGCCA CCATGGCCAA GTTGACCAGT GCCGTTCCGG TGCTCACCGC 3720 

GCGCGACGTC GCCGGAGCGG TCGAGTTCTG GACCGACCGG CTCGGGTTCT CCCGGGACTT 3780 

CGTGGAGGAC GACTTCGCCG GTGTGGTCCG GGACGACGTG ACCCTGTTCA TCAGCGCGGT 3840

CCAGGACCAG GTGGTGCCGG ACAACACCCT GGCCTGGGTG TGGGTGCGCG GCCTGGACGA 3900 

GCTGTACGCC GAGTGGTCGG AGGTCGTGTC CACGAACTTC CGGGACGCCT CCGGGCCGGC 3960

CATGACCGAG ATCGGCGAGC AGCCGTGGGG GCGGGAGTTC GCCCTGCGCG ACCCGGCCGG 4020

CAACTGCGTG CACTTCGTGG CCGAGGAGCA GGACTGANNN NCGGACCGGT CGACTTGTTA 4080 
ACTTGTTTAT TGCAGCTTAT AATGGTTACA AATAAAGCAA TAGCATCACA AATTTCACAA 4140
ATAAAGCATT TTTTTCACTG CATTCTAGTT GTGGTTTGTC CAAACTCATC AATGTATCTT 4200

ATCATGTCTG GATCCAGATC TGGGCCCATG CGGCCGCGGA TCGATNNNNA CATGTGAGCA 4260 

AAAGGCCAGC AAAAGGCCAG GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG 4320

CTCCGCCCCC CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG 4380

ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG CTCTCCTGTT 4440

CCGACCCTGC CGCTTACCGG ATACCTGTCC GCCTTTCTCC CTTCGGGAAG CGTGGCGCTT 4500

TCTCAATGCT CACGCTGTAG GTATCTCAGT TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC 4560

TGTGTGCACG AACCCCCCGT TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT 4620

GAGTCCAACC CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT 4680

AGCAGAGCGA GGTATGTAGG CGGTGCTACA GAGTTCTTGA AGTGGTGGCC TAACTACGGC 4740

TACACTAGAA GGACAGTATT TGGTATCTGC GCTCTGCTGA AGCCAGTTAC CTTCGGAAAA 4800

AGAGTTGGTA GCTCTTGATC CGGCAAACAA ACCACCGCTG GTAGCGGTGG TTTTTTTGTT 4860

TGCAAGCAGC AGATTACGCG CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT 4920

ACGGGGTCTG ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAGATTA 4980 

TCAAAAAGGA TCTTCACCTA GATCCTTTTA AATTAAAAAT GAAGTTTTAA ATCAATCTAA 5040

AGTATATATG AGTAAACTTG GTCTGACAGT TACCAATGCT TAATCAGTGA GGCACCTATC 5100 

TCAGCGATCT GTCTATTTCG TTCATCCATA GTTGCCTGAC TCCCCGTCGT GTAGATAACT 5160 

ACGATACGGG AGGGCTTACC ATCTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC 5220 

TCACCGGCTC CAGATTTATC AGCAATAAAC CAGCCAGCCG GAAGGGCCGA GCGCAGAAGT 5280 

GGTCCTGCAA CTTTATCCGC CTCCATCCAG TCTATTAATT GTTGCCGGGA AGCTAGAGTA 5340 

AGTAGTTCGC CAGTTAATAG TTTGCGCAAC GTTGTTGCCA TTGCTACAGG CATCGTGGTG 5400 

TCACGCTCGT CGTTTGGTAT GGCTTCATTC AGCTCCGGTT CCCAACGATC AAGGCGAGTT 5460

ACATGATCCC CCATGTTGTG CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC 5520

AGAAGTAAGT TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT 5580

ACTGTCATGC CATCCGTAAG ATGCTTTTCT GTGACTGGTG AGTACTCAAC CAAGTCATTC 5640 

TGAGAATAGT GTATGCGGCG ACCGAGTTGC TCTTGCCCGG CGTCAATACG GGATAATACC 5700 

GCGCCACATA GCAGAACTTT AAAAGTGCTC ATCATTGGAA AACGTTCTTC GGGGCGAAAA 5760

CTCTCAAGGA TCTTACCGCT GTTGAGATCC AGTTCGATGT AACCCACTCG TGCACCCAAC 5820

TGATCTTCAG CATCTTTTAC TTTCACCAGC GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA 5880

AATGCCGCAA AAAAGGGAAT AAGGGCGACA CGGAAATGTT GAATACTCAT ACTCTTCCTT 5940 

TTTCAATATT ATTGAAGCAT TTATCAGGGT TATTGTCTCA TGAGCGGATA CATATTTGAA 6000 

TGTATTTAGA AAAATAAACA AATAGGGGTT CCGCGCACAT TTCCCCGAAA AGTGCCACCT 6060 

GACGTCTAAG AAACCATTAT TATCATGACA TTAACCTATA AAAATAGGCG TATCACGAGG 6120 

CCCTTTCGTC TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG 6180 

GAGACGGTCA CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG 6240 

TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA 6300 
CTGAGAGTGC AC 
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CATATGCGGT GTGAAATACC GCACAGATGC GTAAGGAGAA AATACCGCAT CAGGCGCCAT 60 
TCGCCATTCA GGCTGCGCAA CTGTTGGGAA GGGCGATCGG TGCGGGCCTC TTCGCTATTA 120

CGCCAGCTGG CGAAAGGGGG ATGTGCTGCA AGGCGATTAA GTTGGGTAAC GCCAGGGTTT 180 

TCCCAGTCAC GACGTTGTAA AACGACGGCC AGTGAATTCC GATTAGTTCA ATTTGTTAAA 240

GACAGGATCT CAGTAGTCCA GGCTTTAGTC CTGACTCAAC AATACCACCA GCTAAAACCA 300

CTAGAATACG AGCCACAATA AATAAAAGAT TTTATTTAGT TTCCAGAAAA AGGGGGGAAT 360

GAAAGACCCC ACCAAATTGC TTAGCCTGAT AGCCGCAGTA ACGCCATTTT GCAAGGCATG 420 

GAAAAATACC AAACCAAGAA TAGAGAAGTT CAGATCAAGG GCGGGTACAC GAAAACAGCT 480 

AACGTTGGGC CAAACAGGAT ATCTGCGGTG AGCAGTTTCG GCCCCGGCCC GGGGCCAAGA 540 

ACAGATGGTC ACCGCGGTTC GGCCCCGGCC CGGGGCCAAG AACAGATGGT CCCCAGATAT 600 

GGCCCAACCC TCAGCAGTTT CTTAAGACCC ATCAGATGTT TCCAGGCTCC CCCAAGGACC 660 

TGAAATGACC CTGTGCCTTA TTTGAATTAA CCAATCAGCC TGCTTCTCGC TTCTGTTCGC 720 

GCGCTTCTGC TTCCCGAGCT CTATAAAAGA GCTCACAACC CCTCACTCGG CGCGCCAGTC 780

CTCCGATAGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT AAATCCTCTT GCTGTTGCAT 840

CCGACTCGTG GTCTCGCTGT TCCTTGGGAG GGTCTCCTCA GAGTGATTGA CTACCCGTCT 900 

CGGGGGTCTT TCATTTGGGG GCTCGTCCGG GATCTGGAGA CCCCTGCCCA GGGACCACCG 960

ACCCACCACC GGGAGGTAAG CTGGCCAAGA TCCCCCGGGC TGCAGGAATT TATGAAATCC 1020

TTTATGGGGG ACCCCCCCCT TTGTCAACCT TGCTCAATTC CTTCTCCCCC TCCGATCCTA 1080 

AGACTGATTT ACAAGCCCGA CTAAAAGGGC TGCAAGGCGT GCAGGCCCAA ATCTGGACAC 1140 

CCCTGGCCGA ATTGTACCGG CCAGGACATC CACAAACTAG CCACCCATTT CAGGTGGGAG 1200 

ACTCCGTGTA CGTCCGGCGG CACCGCTCTC AAGGATTGGA GCCTCGTTGG AAGGGACCTT 1260 

ACATCGTCCT GCTGACCACG CCCACCGCCA TAAAGGTTGA CGGGATCGCC GCCTGGATTC 1320

ACGCATCGCA CGCCAAGGCA GCCCCAAAAA CCCCTGGACC AGAAACTCCC AAAACCTGGA 1380

AGCTCCGCCG TTCGGAGAAC CCTCTTAAGA TAAGACTCTC CCGTGTCTGA CTGCTAATCC 1440 

ACCTTGTCCC TGTACTAACC CAAAATGAAA CTCCCAACAG GAATGGTCAT TTTATGTAGC 1500

CTAATAATAG TTCGGGCAGG GTTTGACGAC CCCCGCAAGG CTATCGCATT AGTACAAAAA 1560 

CAACATGGTA AACCATGCGA ATGCAGCGGA GGGCAGGTAT CCGAGGCCCC ACCGAACTCC 1620 

ATCCAACAGG TAACTTGCCC AGGCAAGACG GCCTACTTAA TGACCAACCA AAAATGGAAA 1680 

TGCAGAGTCA CTCCAAAAAT CTCACCTAGC GGGGGAGAAC TCCAGAACTG CCCCTGTAAC 1740 

ACTTTCCAGG ACTCGATGCA CAGTTCTTGT TATACTGAAT ACCGGCAATG CAGGCGAATT 1800 

AATAAGACAT ACTACACGGC CACCTTGCTT AAAATACGGT CTGGGAGCCT CAACGAGGTA 1860 

CAGATATTAC AAAACCCCAA TCAGCTCCTA CAGTCCCCTT GTAGGGGCTC TATAAATCAG 1920 

CCCGTTTGCT GGAGTGCCAC AGCCCCCATC CATATCTCCG ATGGTGGAGG ACCCCTCGAT 1980 

ACTAAGAGAG TGTGGACAGT CCAAAAAAGG CTAGAACAAA TTCATAAGGC TATGACTCCT 2040 

GAACTTCAAT ACCACCCCTT AGCCCTGCCC AAAGTCAGAG ATGACCTTAG CCTTGATGCA 2100 

CGGACTTTTG ATATCCTGAA TACCACTTTT AGGTTACTCC AGATGTCCAA TTTTAGCCTT 2160 

GCCCAAGATT GTTGGCTCTG TTTAAAACTA GGTACCCCTA CCCCTCTTGC GATACCCACT 2220 

CCCTCTTTAA CCTACTCCCT AGCAGACTCC CTAGCGAATG CCTCCTGTCA GATTATACCT 2280 

CCCCTCTTGG TTCAACCGAT GCAGTTCTCC AACTCGTCCT GTTTATCTTC CCCTTTCATT 2340 

AACGATACGG AACAAATAGA CTTAGGTGCA GTCACCTTTA CTAACTGCAC CTCTGTAGCC 2400 

AATGTCAGTA GTCCTTTATG TGCCCTAAAC GGGTCAGTCT TCCTCTGTGG AAATAACATG 2460 

GCATACACCT ATTTACCCCA AAACTGGACC AGACTTTGCG TCCAAGCCTC CCTCCTCCCC 2520 

GACATTGACA TCAACCCGGG GGATGAGCCA GTCCCCATTC CTGCCATTGA TCATTATATA 2580

CATAGACCTA AACGAGCTGT ACAGTTCATC CCTTTACTAG CTGGACTGGG AATCACCGCA 2640 

GCATTCACCA CCGGAGCTAC AGGCCTAGGT GTCTCCGTCA CCCAGTATAC AAAATTATCC 2700 

CATCAGTTAA TATCTGATGT CCAAGTCTTA TCCGGTACCA TACAAGATTT ACAAGACCAG 2760 

GTAGACTCGT TAGCTGAAGT AGTTCTCCAA AATAGGAGGG GACTGGACCT ACTAACGGCA 2820 

GAACAAGGAG GAATTTGTTT AGCCTTACAA GAAAAATGCT GTTTTTATGC TAACAAGTCA 2880 

GGAATTGTGA GAAACAAAAT AAGAACCCTA CAAGAAGAAT TACAAAAACG CAGGGAAAGC 2940 

CTGGCAACCA ACCCTCTCTG GACCGGGCTG CAGGGCTTTC TTCCGTACCT CCTACCTCTC 3000 

CTGGGACCCC TACTCACCCT CCTACTCATA CTAACCATTG GGCCATGCGT TTTCAGTCGC 3060 

CTCATGGCCT TCATTAATGA TAGACTTAAT GTTGTACATG CCATGGTGCT GGCCCAGCAA 3120 

TACCAAGCAC TCAAAGCTGA GGAAGAAGCT CAGGATTGAG GCGCCTAGTG TTGACAATTA 3180 

ATCATCGGCA TAGTATACGG CATAGTATAA TACGACTCAC TATAGGAGGG CCACCATGGC 3240 

CAAGTTGACC AGTGCCGTTC CGGTGCTCAC CGCGCGCGAC GTCGCCGGAG CGGTCGAGTT 3300

CTGGACCGAC CGGCTCGGGT TCTCCCGGGA CTTCGTGGAG GACGACTTCG CCGGTGTGGT 3360

CCGGGACGAC GTGACCCTGT TCATCAGCGC GGTCCAGGAC CAGGTGGTGC CGGACAACAC 3420 

CCTGGCCTGG GTGTGGGTGC GCGGCCTGGA CGAGCTGTAC GCCGAGTGGT CGGAGGTCGT 3480 

GTCCACGAAC TTCCGGGACG CCTCCGGGCC GGCCATGACC GAGATCGGCG AGCAGCCGTG 3540

GGGGCGGGAG TTCGCCCTGC GCGACCCGGC CGGCAACTGC GTGCACTTCG TGGCCGAGGA 3600

GCAGGACTGA NNNNCGGACC GGTCGACTTG TTAACTTGTT TATTGCAGCT TATAATGGTT 3660 

ACAAATAAAG CAATAGCATC ACAAATTTCA CAAATAAAGC ATTTTTTTCA CTGCATTCTA 3720 

GTTGTGGTTT GTCCAAACTC ATCAATGTAT CTTATCATGT CTGGATCCAG ATCTGGGCCC 3780 

ATGCGGCCGC GGATCGATNN NNACATGTGA GCAAAAGGCC AGCAAAAGGC CAGGAACCGT 3840

AAAAAGGCCG CGTTGCTGGC GTTTTTCCAT AGGCTCCGCC CCCCTGACGA GCATCACAAA 3900

AATCGACGCT CAAGTCAGAG GTGGCGAAAC CCGACAGGAC TATAAAGATA CCAGGCGTTT 3960

CCCCCTGGAA GCTCCCTCGT GCGCTCTCCT GTTCCGACCC TGCCGCTTAC CGGATACCTG 4020 

TCCGCCTTTC TCCCTTCGGG AAGCGTGGCG CTTTCTCAAT GCTCACGCTG TAGGTATCTC 4080 
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^^S T AGGTC ^TTCG CTCCAAGCTG GGCTGTGTGC ACGAACCCCC CGTTCAGCCC 
S^CTGCG CCTTATGCGG TAACTATCGT CTTGAGTCCA ACCCGGTAAG aScScSS 

£* GCAGCCAC TGGTAACAGG ATTAGCAGAG CGAGGTATGT AgSSS 
^ AGA ^" TGAAGTGGTG GCCTAACTAC GGCTACACTA GAAGGACAGT ATTTGctS 
TGCGCTCTGC TGAAGCCAGT TACCTTCGGA AAAAGAGTTG GTAGCTCTTG ATCCGGCAAA 
™™^ G CTGGTAGGGG TGGTTTTTTT GTTTGCAAGC AGCAgSS SSSS^ 

^ GAAGATCC TTTGATCTTT TCTACGGGGT CTGACGCTCA GtSS^aS 

^ G ;^ GGTT ^gggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcc^ 

TTAAATTAAA AATGAAGTTT TAAATCAATC TAAAGTATAT ATGAGTAAAC TTGGTCTGAr 
AGTTACCAAT GCTTAATCAG TGAGGCACCT ATCTCAGCGA TCTCTCW J^cItcC 
ATAGTTGCCT GACTCCCCGT CGTGTAGATA ACTACGATAC GGGAGGGCTT ACCATCTGGC 
CCCAGTGCTG CAATGATACC GCGAGACCCA CGCTCACCGG CTCCAGATTT ATCAGcSa 
AACCAGCCAG CCGGAAGGGC CGAGCGCAGA AGTGGTCCTG CAACTTtSc CGCCTC^tJ 

cagtctatta attgttgccg ggaagctaga gtaagtagtt cgccagttS tagtttcJgc 
aacgttgttg ccattgctac aggcatcgtg gtgtcacgct cgtcgtSgg tatcgSS 

™5SS^S GTTCCCAACG ATCAAGGCGA GTTACATGAT CCCCCATCTT G^SIIS 
G £ GG ;^SS T CCTTCGGTCG TCCGATCGTT GTCAGAAGTA AGTTGGCCGC AGTGTTATCA 
G IS™^ TGGCAGCACT GCATAATTCT CTTACTGTCA TGCCATCCGT aIgATGCtS 

JgC^GCC ^^ TCA TTCTGAGAAT AGTGTATGCG SSSSS? 

^ GG ; GTTGCC CG GCGTCAAT ACGGGATAAT ACCGCGCCAC ATAGCAGAAC TTTAAAAGTG 
^r^t GAAAACGTTC TTCGGGGCGA AAACTCTCAA GGATCTTACC GCTGTTGAGA 
T GG ^ GGA TGTAACCCAC TCGTGCACCC AACTGATCTT CAGCATCTTT TACTTTCACC 
AGCGTTTCTG GGTGAGCAAA AACAGGAAGG CAAAATGCCG CAAAAAAGGG AATAAGGGCG 
ACACGGAAAT GTTGAATACT CATACTCTTC CTTTTTCAAT ATTATTGAAG cimSSS 

SttccgSS SSJSSS *T acatattt gaa tgtattt aSJESX acISS 

GTTCCGCGCA CATTTCCCCG AAAAGTGCCA CCTGACGTCT AAGAAACCAT TATTATCATG 
™JCT ATAAAAATAG GCGTATCACG AGGCCCTTTC GTCTCGCGCG TTT^JS 
^Sr^^t AGCTCTGACA CATGCAGCTC CCGGAGACGG TCACAGCTTG TCTGTAAGCG 
GA ^ GGGGA GCAGACAAGC CCGTCAGGGC GCGTCAGCGG GTGTTGGCGG GTGTCGGGGC 
TGGCTTAACT ATGCGGCATC AGAGCAGATT GTACTGAGAG TGCAC tjlGTCGGGGC 
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AGATCTCCCG ATCCCCTATG GTCGACTCTC AGTACAATCT GCTCTGATGC CGCATAGTTA 60 
AGCCAGTATC TGCTCCCTGC TTGTGTGTTG GAGGTCGCTG AGTAGTGCGC GAGCAAAATT 120

TAAGCTACAA CAAGGCAAGG CTTGACCGAC AATTGCATGA AGAATCTGCT TAGGGTTAGG 180 

CGTTTTGCGC TGCTTCGCGA TGTACGGGCC AGATATACGC GTTGACATTG ATTATTGACT 240

AGTTATTAAT AGTAATCAAT TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC 300 

GTTACATAAC TTACGGTAAA TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG 360

ACGTCAATAA TGACGTATGT TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA 420 

TGGGTGGACT ATTTACGGTA AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA 480 

AGTACGCCCC CTATTGACGT CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC 540 

ATGACCTTAT GGGACTTTCC TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC 600 

ATGGTGATGC GGTTTTGGCA GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA 660

TTTCCAAGTC TCCACCCCAT TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG 720 

GACTTTCCAA AATGTCGTAA CAACTCCGCC CCATTGACGC AAATGGGCGG TAGGCGTGTA 780 

CGGTGGGAGG TCTATATAAG CAGAGCTCTC TGGCTAACTA GAGAACCCAC TGCTTAACTG 840 

GCTTATCGAA ATGTCGACTG AGAACTTCAG GGTGAGTTTG GGGACCCTTG ATTGTTCTTT 900 

CTTTTTCGCT ATTGTAAAAT TCATGTTATA TGGAGGGGGC AAAGTTTTCA GGGTGTTGTT 960

TAGAATGGGA AGATGTCCCT TGTATCACCA TGGACCCTCA TGATAATTTT GTTTCTTTCA 1020

CTTTCTACTC TGTTGACAAC CATTGTCTCC TCTTATTTTC TTTTCATTTT CTGTAACTTT 1080 

TTCGTTAAAC TTTAGCTTGC ATTTGTAACG AATTTTTAAA TTCACTTTTG TTTATTTGTC 1140 

AGATTGTAAG TACTTTCTCT AATCACTTTT TTTTCAAGGC AATCAGGGTA TATTATATTG 1200 

TACTTCAGCA CAGTTTTAGA GAACAATTGT TATAATTAAA TGATAAGGTA GAATATTTCT 1260 

GCATATAAAT TCTGGCTGGC GTGGAAATAT TCTTATTGGT AGAAACAACT ACATCCTGGT 1320 

CATCATCCTG CCTTTCTCTT TATGGTTACA ATGATATACA CTGTTTGAGA TGAGGATAAA 1380

ATACTCTGAG TCCAAACCGG GCCCCTCTGC TAACCATGTT CATGCCTTCT TCTTTTTCCT 1440 

ACAGCTCCTG GGCAACGTGC TGGTTGTTGT GCTGTCTCAT CATTTTGGCA AGGATCGGCC 1500 

GGAACAGCAT CAGGACCGAC ATGGAAGGTC CAGCGTTCTC AAAACCCCTT AAAGATAAGA 1560

TTAACCCGTG GAAGTCCTTA ATGGTCATGG GGGTCTATTT AAGAGTAGGG ATGGCAGAGA 1620 

GCCCCCATCA GGTCTTTAAT GTAACCTGGA GAGTCACCAA CCTGATGACT GGGCGTACCG 1680 

CCAATGCCAC CTCCCTTTTA GGAACTGTAC AAGATGCCTT CCCAAGATTA TATTTTGATC 1740 

TATGTGATCT GGTCGGAGAA GAGTGGGACC CTTCAGACCA GGAACCATAT GTCGGGTATG 1800 

GCTGCAAATA CCCCGGAGGG AGAAAGCGGA CCCGGACTTT TGACTTTTAC GTGTGCCCTG 1860

GGCATACCGT AAAATCGGGG TGTGGGGGGC CAAGAGAGGG CTACTGTGGT GAATGGGGTT 1920

GTGAAACCAC CGGACAGGCT TACTGGAAGC CCACATCATC ATGGGACCTA ATCTCCCTTA 1980 

AGCGCGGTAA CACCCCCTGG GACACGGGAT GCTCCAAAAT GGCTTGTGGC CCCTGCTACG 2040 

ACCTCTCCAA AGTATCCAAT TCCTTCCAAG GGGCTACTCG AGGGGGCAGA TGCAACCCTC 2100 

TAGTCCTAGA ATTCACTGAT GCAGGAAAAA AGGCTAATTG GGACGGGCCC AAATCGTGGG 2160 

GACTGAGACT GTACCGGACA GGAACAGATC CTATTACCAT GTTCTCCCTG ACCCGCCAGG 2220 

TCCTCAATAT AGGGCCCCGC ATCCCCATTG GGCCTAATCC CGTGATCACT GGTCAACTAC 2280 

CCCCCTCCCG ACCCGTGCAG ATCAGGCTCC CCAGGCCTCC TCAGCCTCCT CCTACAGGCG 2340 

CAGCCTCTAT AGTCCCTGAG ACTGCCCCAC CTTCTCAACA ACCTGGGACG GGAGACAGGC 2400

TGCTAAACCT GGTAGAAGGA GCCTATCAGG CGCTTAACCT CACCAATCCC GACAAGACCC 2460

AAGAATGTTG GCTGTGCTTA GTGTCGGGAC CTCCTTATTA CGAAGGAGTA GCGGTCGTGG 2520 

GCACTTATAC CAATCATTCT ACCGCCCCGG CCAGCTGTAC GGCCACTTCC CAACATAAGC 2580

TTACCCTATC TGAAGTGACA GGACAGGGCC TATGCATGGG AGCACTACCT AAAACTCACC 2640 

AGGCCTTATG TAACACCACC CAAAGTGCCG GCTCAGGATC CTACTACCTT GCAGCACCCG 2700 

CTGGAACAAT GTGGGCTTGT AGCACTGGAT TGACTCCCTG CTTGTCCACC ACGATGCTCA 2760 

ATCTAACCAC AGACTATTGT GTATTAGTTG AGCTCTGGCC CAGAATAATT TACCACTCCC 2820

CCGATTATAT GTATGGTCAG CTTGAACAGC GTACCAAATA TAAGAGGGAG CCAGTATCGT 2880 

TGACCCTGGC CCTTCTGCTA GGAGGATTAA CCATGGGAGG GATTGCAGCT GGAATAGGGA 2940

CGGGGACCAC TGCCCTAATC AAAACCCAGC AGTTTGAGCA GCTTCACGCC GCTATCCAGA 3000 

CAGACCTCAA CGAAGTCGAA AAATCAATTA CCAACCTAGA AAAGTCACTG ACCTCGTTGT 3060

CTGAAGTAGT CCTACAGAAC CGAAGAGGCC TAGATTTGCT CTTCCTAAAA GAGGGAGGTC 3120

TCTGCGCAGC CCTAAAAGAA GAATGTTGTT TTTATGCAGA CCACACGGGA CTAGTGAGAG 3180 

ACAGCATGGC CAAACTAAGG GAAAGGCTTA ATCAGAGACA AAAACTATTT GAGTCAGGCC 3240 

AAGGTTGGTT CGAAGGGCAG TTTAATAGAT CCCCCTGGTT TACCACCTTA ATCTCCACCA 3300

TCATGGGACC TCTAATAGTA CTCTTACTGA TCTTACTCTT TGGACCCTGC ATTCTCAATC 3360

GATTAGTTCA ATTTGTTAAA GACAGGATCT CAGTAGTCCA GGCTTTAGTC CTGACTCAAC 3420

AATACCACCA GCTAAAGCCT ATAGAGTACG AGCCATAGGG CGCCTAGTGT TGACAATTAA 3480 

TCATCGGCAT AGTATACGGC ATAGTATAAT ACGACTCACT ATAGGAGGGC CACCATGGCC 3540

AAGTTGACCA GTGCCGTTCC GGTGCTCACC GCGCGCGACG TCGCCGGAGC GGTCGAGTTC 3600

TGGACCGACC GGCTCGGGTT CTCCCGGGAC TTCGTGGAGG ACGACTTCGC CGGTGTGGTC 3660

CGGGACGACG TGACCCTGTT CATCAGCGCG GTCCAGGACC AGGTGGTGCC GGACAACACC 3720 

CTGGCCTGGG TGTGGGTGCG CGGCCTGGAC GAGCTGTACG CCGAGTGGTC GGAGGTCGTG 3780

TCCACGAACT TCCGGGACGC CTCCGGGCCG GCCATGACCG AGATCGGCGA GCAGCCGTGG 3840 

GGGCGGGAGT TCGCCCTGCG CGACCCGGCC GGCAACTGCG TGCACTTCGT GGCCGAGGAG 3900

CAGGACTGAN NNNCGGACCG GTCGA 3925
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